The following is a model of an ideal software structure. Whether or not Socrates will use it or a variation is by no means decided.

 

This is the basis for a development discussion, where the focus should be on community, testability and expandability.

 

# A picture says more than 1000 words

The archives contain numerous mails on this theme, often with terminology misunderstandings.

 

      +---------- Consumers ---------+
      |         +--- convert -----+  |
      |       +--- editor ------+ |  |
      |     +--- dfutil ------+ |-+  |
      |   +--- html view ---+ |-+    |
      |   |                 |-+      |
      |   +-----------------+        |
      +------------------------------+
                     /\
                     ||
                     \/
      +--------- DocFormat ----------+
      |        +--- Helper --------+ |
      |      +--- Management ----+ | |
      |    +--- Portability ---+ |-+ |
      |  +--- DataCapsule ---+ |-+   |
      |  |                   |-+     |
      |  +-------------------+       |
      +------------------------------+
                     /\
                     ||
                     \/
      +---------- Filters -----------+
      |           +--- XML -----+    |
      |         +--- ODF -----+ |    |
      |       +--- PDF -----+ |-+    |
      |     +--- HTML ----+ |-+      |
      |   +--- OXML ----+ |-+        |
      | +--- LATEX ---+ |-+          |
      | |             |-+            |
      | +-------------+              |
      +------------------------------+

 

# Description

The following is a description of the modules that are foreseen; not all of them have been made yet. The description is not to be thought of as a developer's blueprint for programming, but merely as a help to understand the title.

 

 

## Consumers

Most of the consumers will hopefully be created outside the Socrates project. The Socrates project will supply a couple of examples.

 

### convert

A command line utility that can convert between all formats.

 

### editor

A Qt-based editor that can edit the DataCapsule.

 

### dfutil

The black-box and white-box unit test module. It has all white-box unit tests compiled in, and dumps result files for black-box unit tests.

 

Remark: dfutil already exists and needs only minor modifications.

 

 

### html view

An HTML viewer, basically Firefox or Internet Explorer.

 

We simply provide documentation on how to use it.

 

 

## DocFormat

DocFormat is the kernel of Socrates, the part around which everything else
revolves. DocFormat is already available and only needs some cleaning up.

 

### Helper

Helper is a library within DocFormat that offers specialized functions to all
other parts of Socrates. We do not want to use 3rd party libraries all over
the source, so the functions we make available from a 3rd party library are
always covered by functions in Helper.

 

Helper guarantees that we can freely exchange libraries (e.g. glib instead of
zlib or xalan-c instead of libxml). This freedom is important since these
libraries might change to a stricter license, which would limit our
distribution possibilities.

 

Helper also guarantees that the rest of Socrates does not need to care about

library versions and differences.
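
A minimal sketch of how Helper could cover a 3rd party library. The names `helper::compress` and `helper::decompress` are assumptions for illustration and do not come from the Socrates source; the run-length coder merely stands in for a real library such as zlib so that the example stays self-contained.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical Helper API: the rest of the source only sees these two
// functions and never includes a 3rd party header directly.
namespace helper {

// Stand-in backend (simple run-length encoding). In a real build this body
// would call the chosen 3rd party library instead.
inline std::string compress(const std::string &in) {
    std::string out;
    for (std::size_t i = 0; i < in.size();) {
        std::size_t run = 1;
        while (i + run < in.size() && in[i + run] == in[i] && run < 255)
            ++run;
        out.push_back(static_cast<char>(run));  // run length, 1..255
        out.push_back(in[i]);                   // the repeated byte
        i += run;
    }
    return out;
}

inline std::string decompress(const std::string &in) {
    std::string out;
    for (std::size_t i = 0; i + 1 < in.size(); i += 2) {
        const std::size_t run = static_cast<unsigned char>(in[i]);
        out.append(run, in[i + 1]);
    }
    return out;
}

}  // namespace helper

// A caller (e.g. a filter) depends only on the Helper signatures, so the
// backend can be exchanged without touching this code.
inline void helperExample() {
    const std::string text = "aaaabbbcc";
    assert(helper::decompress(helper::compress(text)) == text);
}
```

Exchanging the library would then mean rewriting the two function bodies, while every caller stays untouched.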

 

### Management

Management is merely a set of functions that tie the system together.

 

Some examples:

- registration of file suffix to filter type

- open/close files

- activate/deactivate a filter
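
The first and last of these examples could be sketched as follows. The class `FilterRegistry` and all of its members are hypothetical names chosen for illustration, and the suffix lookup is deliberately naive (it takes whatever follows the last dot). Opening and closing files is left out.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>

// Hypothetical management registry: maps file suffixes to filter names and
// tracks which filters are currently active.
class FilterRegistry {
public:
    void registerSuffix(const std::string &suffix, const std::string &filter) {
        suffixToFilter_[suffix] = filter;
        active_.insert(filter);  // newly registered filters start active
    }
    void activate(const std::string &filter) { active_.insert(filter); }
    void deactivate(const std::string &filter) { active_.erase(filter); }

    // Returns the name of the filter responsible for `path`, or an empty
    // string when the suffix is unknown or the filter is deactivated.
    std::string filterFor(const std::string &path) const {
        const std::string::size_type dot = path.rfind('.');
        if (dot == std::string::npos) return std::string();
        const auto it = suffixToFilter_.find(path.substr(dot + 1));
        if (it == suffixToFilter_.end()) return std::string();
        if (active_.count(it->second) == 0) return std::string();
        return it->second;
    }

private:
    std::map<std::string, std::string> suffixToFilter_;
    std::set<std::string> active_;
};

inline void registryExample() {
    FilterRegistry registry;
    registry.registerSuffix("odt", "ODF");
    registry.registerSuffix("docx", "OXML");
    assert(registry.filterFor("report.docx") == "OXML");
    registry.deactivate("OXML");
    assert(registry.filterFor("report.docx").empty());
}
```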

 

 

### Portability

Portability is the place where all platform differences are hidden. Socrates
source code is only allowed to use ANSI standard functions; if an OS-specific
function like dirSearch is needed, Portability provides a cover function.

 

Portability does for platforms what Helper does for 3rd party libraries.
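
As an illustration, a cover function for dirSearch might look like the sketch below. The signature is an assumption (the text above only names the function), and the namespace `portability` is hypothetical. The platform calls themselves (`opendir`/`readdir`/`closedir` on POSIX, `FindFirstFileA`/`FindNextFileA`/`FindClose` on Windows) are the standard ones.

```cpp
#include <cassert>
#include <string>
#include <vector>

#ifdef _WIN32
#include <windows.h>
#else
#include <dirent.h>
#endif

namespace portability {

// Hypothetical cover function: returns the entry names found in `dir`
// (without "." and ".."), or an empty list if the directory cannot be read.
inline std::vector<std::string> dirSearch(const std::string &dir) {
    std::vector<std::string> names;
#ifdef _WIN32
    WIN32_FIND_DATAA entry;
    HANDLE handle = FindFirstFileA((dir + "\\*").c_str(), &entry);
    if (handle == INVALID_HANDLE_VALUE) return names;
    do {
        names.push_back(entry.cFileName);
    } while (FindNextFileA(handle, &entry));
    FindClose(handle);
#else
    DIR *handle = opendir(dir.c_str());
    if (handle == nullptr) return names;
    while (const dirent *entry = readdir(handle))
        names.push_back(entry->d_name);
    closedir(handle);
#endif
    // Drop the self and parent entries so callers see the same kind of list
    // on every platform.
    std::vector<std::string> result;
    for (const std::string &name : names)
        if (name != "." && name != "..") result.push_back(name);
    return result;
}

}  // namespace portability

inline void portabilityExample() {
    // An unreadable directory yields an empty list on every platform.
    assert(portability::dirSearch("no/such/directory/here").empty());
}
```

The `#ifdef` lives in exactly one place; code outside Portability only ever calls `portability::dirSearch`.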

 

 

### DataCapsule

DataCapsule is the in-memory storage of documents. It contains functions to
traverse the document and to copy, move, add and delete atoms in the document.

 

When a filter reads a document, it stores the content as atoms in the
DataCapsule. Likewise, when a filter wants to write a document, it traverses
the document atom by atom and writes the file.
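
The read and write flow above can be sketched as follows. Since the definition of atoms is still open, the `Atom` structure here (a kind, a text and child atoms) is purely an assumption for illustration, as are all function names. Only adding and traversing are shown.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical atom layout; the real definition is not yet decided.
struct Atom {
    std::string kind;
    std::string text;
    std::vector<std::unique_ptr<Atom>> children;
};

class DataCapsule {
public:
    using Visitor = std::function<void(const Atom &, std::size_t depth)>;

    DataCapsule() : root_(std::make_unique<Atom>()) { root_->kind = "document"; }

    Atom &root() { return *root_; }

    // Used by a reading filter: appends a new atom below `parent`.
    static Atom &add(Atom &parent, const std::string &kind,
                     const std::string &text) {
        auto atom = std::make_unique<Atom>();
        atom->kind = kind;
        atom->text = text;
        parent.children.push_back(std::move(atom));
        return *parent.children.back();
    }

    // Used by a writing filter: visits every atom depth-first.
    void traverse(const Visitor &visit) const { walk(*root_, 0, visit); }

private:
    static void walk(const Atom &atom, std::size_t depth, const Visitor &visit) {
        visit(atom, depth);
        for (const auto &child : atom.children) walk(*child, depth + 1, visit);
    }

    std::unique_ptr<Atom> root_;
};

// A toy "writer" that turns the capsule into an indented outline.
inline std::string writeOutline(const DataCapsule &doc) {
    std::string out;
    doc.traverse([&out](const Atom &atom, std::size_t depth) {
        out += std::string(depth * 2, ' ') + atom.kind + ": " + atom.text + "\n";
    });
    return out;
}

inline void capsuleExample() {
    DataCapsule doc;
    Atom &paragraph = DataCapsule::add(doc.root(), "paragraph", "Hello");
    DataCapsule::add(paragraph, "run", "world");
    assert(writeOutline(doc) ==
           "document: \n  paragraph: Hello\n    run: world\n");
}
```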

 

In theory the way atoms store data is unknown to the filters, which only have
access functions, but for practical purposes (to avoid making tons of new
header files) we need to define some "standard" format to use, e.g. CSS for
styles.

 

The exact definition of atoms has not yet been settled and will only be so
after some long discussions.

 

However, it is important that the DataCapsule gives true format independence,
because it is never used as a target file format. We might choose to continue
using CSS/HTML internally, but that will most likely be extended with our own
tags and will therefore differ from what the HTML filter writes.

 

## Filters

Filters are format converters: a filter converts between a specific file
format and the DataCapsule.

 

A filter contains a set of predefined functions:

- read file

- write file

- register file extensions

- get statistics
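
One way to picture these four functions is the interface sketched below. All names are hypothetical, `DataCapsule` is reduced to a placeholder, and the sketch reads from and writes to streams rather than file paths so that it can be exercised without touching the disk.

```cpp
#include <cassert>
#include <cstddef>
#include <istream>
#include <ostream>
#include <sstream>
#include <string>
#include <vector>

// Placeholder for the in-memory document; the real DataCapsule is richer.
struct DataCapsule {
    std::string content;
};

struct FilterStatistics {
    std::size_t documentsRead = 0;
    std::size_t documentsWritten = 0;
};

// Hypothetical interface covering the four predefined functions.
class Filter {
public:
    virtual ~Filter() = default;
    virtual bool read(std::istream &in, DataCapsule &doc) = 0;
    virtual bool write(std::ostream &out, const DataCapsule &doc) = 0;
    virtual std::vector<std::string> fileExtensions() const = 0;
    virtual FilterStatistics statistics() const = 0;
};

// Trivial example filter: plain text in, plain text out.
class PlainTextFilter : public Filter {
public:
    bool read(std::istream &in, DataCapsule &doc) override {
        std::ostringstream buffer;
        buffer << in.rdbuf();  // copy the whole stream
        doc.content = buffer.str();
        ++stats_.documentsRead;
        return !in.bad();
    }
    bool write(std::ostream &out, const DataCapsule &doc) override {
        out << doc.content;
        ++stats_.documentsWritten;
        return out.good();
    }
    std::vector<std::string> fileExtensions() const override { return {"txt"}; }
    FilterStatistics statistics() const override { return stats_; }

private:
    FilterStatistics stats_;
};

inline void filterExample() {
    PlainTextFilter filter;
    DataCapsule doc;
    std::istringstream in("hello");
    std::ostringstream out;
    assert(filter.read(in, doc) && filter.write(out, doc));
    assert(out.str() == "hello");
}
```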

 

A filter lives independently and can therefore be black-box tested
independently. This is important since it simplifies testing heavily: we do
not need to test all combinations of filters.

 

### XML

This is a test filter that basically dumps the DataCapsule. This allows dfutil
to compare an old XML file with a new XML file, to check whether a test case
caused differences.
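
The comparison step could be as simple as the sketch below. This is not how dfutil is implemented; `firstDifference` is a hypothetical helper that compares two dumps line by line and reports where they first diverge.

```cpp
#include <cassert>
#include <cstddef>
#include <sstream>
#include <string>

// Returns the 1-based number of the first line on which the two dumps
// differ, or 0 when no difference is found.
inline std::size_t firstDifference(const std::string &oldDump,
                                   const std::string &newDump) {
    std::istringstream oldIn(oldDump);
    std::istringstream newIn(newDump);
    std::string oldLine;
    std::string newLine;
    std::size_t lineNo = 0;
    while (true) {
        const bool hasOld = static_cast<bool>(std::getline(oldIn, oldLine));
        const bool hasNew = static_cast<bool>(std::getline(newIn, newLine));
        ++lineNo;
        if (!hasOld && !hasNew) return 0;  // both dumps ended together
        if (hasOld != hasNew || oldLine != newLine) return lineNo;
    }
}

inline void dumpCompareExample() {
    const std::string before = "<doc>\n<p>Hello</p>\n</doc>\n";
    const std::string after = "<doc>\n<p>Hallo</p>\n</doc>\n";
    assert(firstDifference(before, before) == 0);
    assert(firstDifference(before, after) == 2);
}
```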

 

### ODF

This filter exists, but needs to be extended for
spreadsheet/presentation/drawings.

 

### PDF

There are programs that can generate PDF files, so the filter is not a high
priority, just nice to have.

 

### HTML

Remark: even if Socrates ends up using HTML internally, we should still have
an HTML filter.

 

Think of the following situation. A user reports a rendering bug, and we
change the DataCapsule so the HTML is correct. The user is happy, but by doing
that we broke the conversion between ODF and OXML.

 

If we had a filter, we would make the change in the filter, keep the internal
representation, and have no side effects.

 

### OXML

This filter exists, but needs to be extended for
spreadsheet/presentation/drawings.

 

### LATEX

This filter exists, but needs to be finished.

 

# How to

This is a prioritized list of how to get from where we are to where we would
like to be.

 

## Rearrange source

We will make new main directories (consumers, filters), and move the code into

these directories.

 

This helps new people understand which part to attack. Currently DocFormat
contains multiple libraries for multiple purposes. It is e.g. not evident that
"word" contains the OXML filter.

 

## Remove cross references

Currently any file potentially uses functions in any other file (it's not that
bad, but we don't have a call graph).

 

Functions which are used in multiple places should move to "helpers", so that
it's very clear which functions are common and which are not.

 

## Isolate 3rd party and OS

A special variant of cross references is the use of 3rd party header files.

Changing that will require new code in helpers.

 

OS functions do not seem to be isolated in the current source; isolating them
will make porting easier.

 

If we have a portability library, we can test that once on all platforms, and
knowing that it works means we don't need to test e.g. filters on all
platforms.

 

 

## Add interfaces

Having a clean code structure allows us to add interfaces.

 

Filters in particular call for a C++ base class from which each individual
filter inherits. Doing so makes management a lot easier to program, because
the concrete type only needs to be known when the filter is allocated.
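
A minimal sketch of that idea, assuming a hypothetical base class `Filter` and an allocation function `allocateFilter`. None of these names exist in the current source, and the suffix checks are only for illustration.

```cpp
#include <cassert>
#include <memory>
#include <string>

// Hypothetical common base class for all filters.
class Filter {
public:
    virtual ~Filter() = default;
    virtual std::string formatName() const = 0;
};

class OdfFilter : public Filter {
public:
    std::string formatName() const override { return "ODF"; }
};

class OxmlFilter : public Filter {
public:
    std::string formatName() const override { return "OXML"; }
};

// The only place where the concrete filter type is known.
inline std::unique_ptr<Filter> allocateFilter(const std::string &suffix) {
    if (suffix == "odt") return std::make_unique<OdfFilter>();
    if (suffix == "docx") return std::make_unique<OxmlFilter>();
    return nullptr;  // no filter registered for this suffix
}

// Management code works through the base class only.
inline std::string describe(const Filter &filter) {
    return "using " + filter.formatName() + " filter";
}

inline void interfaceExample() {
    const std::unique_ptr<Filter> filter = allocateFilter("docx");
    assert(filter != nullptr);
    assert(describe(*filter) == "using OXML filter");
}
```

Adding a new filter then touches only the allocation function; `describe` and any other management code stay unchanged.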

 

## Define data capsule

This is the last part and for sure the biggest discussion. For the moment this

is left for a future discussion.

 

 
