Many OpenOffice.org pages are published in a multi-sub-domain structure. See http://wiki.services.openoffice.org/wiki/Infrastructure_Overview for details. Apart from the main site, which is now hosted on Kenai, that overview is probably still accurate.
Active address list:
Project name | URL | Hosted at |
---|---|---|
About | | Kenai |
API | | Kenai |
Bugzilla | | Kenai |
Development | | Kenai |
Distribution | | Kenai |
Documentation | | Kenai |
Download | | Kenai |
Main page | | Kenai |
Marketing | | Kenai |
Native pages list | | Kenai |
Projects list and individual addresses (146 projects) | | Kenai |
Support | | Kenai |
| | |
Extensions | | OSUOSL |
Forums | | Oracle |
Templates | | OSUOSL |
Wiki | | Oracle |

Also see OpenOffice Domains.
A sitemap of the web pages located on kenai.com has been added above. Some NLC projects are missing because of technical issues (e.g. es.oo.o).
Archive creation
Possible:
- Web content checkout via SVN URL.
The AOOo project has a script and a web project list in https://svn.apache.org/viewvc/incubator/ooo/trunk/tools/dev/ that automate checkout and update.
Look for fetch-all-web.sh and web-list.txt. The text file needs to be edited. The script performs svn update on existing project directories to save time.
Here is how to do it for an individual project.
Syntax:
svn co https://svn.openoffice.org/svn/<$projectname>~webcontent your_local_dir
Example:
svn co https://svn.openoffice.org/svn/download~webcontent download --> to get all website content from the download project
Do the same for the other projects.
- Wiki: database dump. Clayton Cornell was able to help with this, but he is no longer an available resource. TerryE has dumps and full VM copies of both the wiki and the forums. Use me (TerryE) as a source (subject to access approvals).
- Bugzilla: Hopefully Oracle will provide a database dump; if not, we can use the XML export. Bugzilla can import these XML files.
- Forums: As far as I know, we have admins of the OOo user forums in our group; they can make a dump of the database via the phpBB admin interface.
- Extensions and Templates: We really need to back these up. AFAIK the servers for these services are not hosted by Oracle; they are hosted at OSUOSL.
- Use wget
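
For the wget option, a mirror of a single sub-domain might look like the sketch below. It is only an illustration: the `<name>.openoffice.org` host pattern, the function name `mirror_subdomain`, and the `WGET` override variable are assumptions and are not taken from this page.

```shell
#!/bin/sh
# Sketch: mirror one sub-domain with wget.
# Assumptions (not from the original page): the <name>.openoffice.org host
# pattern, the function name, and the WGET variable are illustrative only.

WGET="${WGET:-wget}"   # override, e.g. WGET=echo, for a dry run

mirror_subdomain() {
    name=$1
    target_dir=$2
    # --mirror: recursive download with timestamping
    # --page-requisites: also fetch images and stylesheets
    # --convert-links: rewrite links so the copy works offline
    # --no-parent: never ascend above the start URL
    # --wait=1: pause one second between requests
    $WGET --mirror --page-requisites --convert-links --no-parent --wait=1 \
        --directory-prefix="$target_dir" "http://$name.openoffice.org/"
}
```

Calling `mirror_subdomain download ./mirror` would store the copy under `./mirror`. Setting `WGET=echo` first prints the command instead of running it. Unlike an SVN checkout, a mirror only captures pages that are reachable by links.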
Note: I (rbircher) already have a script that checks out all projects in series; the only thing I need is a .txt file that lists all project names (one per line).
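
The serial checkout described above can be sketched roughly as follows. This is not the fetch-all-web.sh script or rbircher's script, just a minimal illustration of the same idea. It assumes a list file with one project name per line; the function names `webcontent_url` and `fetch_all_webcontent` and the `SVN` override variable are hypothetical.

```shell
#!/bin/sh
# Sketch: check out or update the web content of every project in a list.
# Assumptions (not from the original page): the list file has one project
# name per line; function names and the SVN variable are illustrative only.

SVN_BASE="https://svn.openoffice.org/svn"
SVN="${SVN:-svn}"   # override, e.g. SVN=echo, for a dry run

# Build the web content URL for one project, following the syntax above.
webcontent_url() {
    printf '%s/%s~webcontent\n' "$SVN_BASE" "$1"
}

# Check out each listed project into target_dir/<project>,
# or update it if a working copy already exists there.
fetch_all_webcontent() {
    list_file=$1
    target_dir=$2
    mkdir -p "$target_dir" || return 1
    while IFS= read -r project || [ -n "$project" ]; do
        # Skip blank lines and comment lines.
        case $project in
            ''|'#'*) continue ;;
        esac
        # Redirect stdin so svn cannot consume the rest of the list.
        if [ -d "$target_dir/$project/.svn" ]; then
            $SVN update "$target_dir/$project" < /dev/null
        else
            $SVN co "$(webcontent_url "$project")" "$target_dir/$project" < /dev/null
        fi
    done < "$list_file"
}
```

For example, `fetch_all_webcontent projects.txt ./webcontent` would process every name in `projects.txt` (a hypothetical file name). Updating existing working copies instead of checking out again is what saves time on repeated runs.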
Todo plan
- Create full sub-domains list (Substantial progress)
- Create archive (can be done in a people.apache.org account; development, documentation, download, projects, and www take 2.7 GB)
- Determine how to deal with current "projects" (many!)
- Select needed content
- Move content to new page