This is brief but hopefully useful documentation of Archiva. I put it up to organize my thoughts while working on MRM-409, and it may be of help to other developers too.
Repository Scanning

...

Classes

Below are some of the important classes of Repository Scanning:

| Class | Implements | What does it do? |
| --- | --- | --- |
| DefaultRepositoryScanner | RepositoryScanner | Uses plexus-utils' DirectoryWalker to scan the repository. |
| RepositoryScannerInstance | DirectoryWalkListener | Listener that sets the trigger to start the consumers. |
| RepositoryContentStatistics | generated by modello | Contains the stats (duration, no. of files discovered, etc.) of the repository scan. |
| TriggerBeginScanClosure | Closure (commons-collections) | Signals to the consumer(s) that the repository scanning will begin. |
| DefaultBidirectionalRepositoryLayout | BidirectionalRepositoryLayout | Default bidirectional layout used by m2 repositories. |
| ArchivaArtifact | | Archiva artifact object. |
| ArchivaArtifactModel | generated by modello | Contains the detailed attributes of an Archiva artifact, such as groupId, artifactId, version, checksums, etc. |
| FileContentRecord | LuceneRepositoryContentRecord | Contains the contents of the artifact to be indexed. |
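To make the relationship between the walker, the listener and the consumers a bit more concrete, here is a minimal sketch of the general idea. It is not Archiva's code: ScanConsumer, ScanStats and ScanSketch are hypothetical names, and java.nio's Files.walk is swapped in for plexus-utils' DirectoryWalker so that the example is self-contained.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Collections;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

/** Hypothetical consumer contract for this sketch; not Archiva's actual interface. */
interface ScanConsumer {
    void beginScan();
    void processFile(Path relativePath);
    void completeScan();
}

/** Hypothetical stand-in for the scan statistics (duration, files discovered). */
final class ScanStats {
    long durationMillis;
    long filesDiscovered;
}

final class ScanSketch {
    /** Walks the repository root and hands every regular file to each consumer. */
    static ScanStats scan(Path repositoryRoot, List<ScanConsumer> consumers) throws IOException {
        ScanStats stats = new ScanStats();
        long start = System.currentTimeMillis();

        // Signal to every consumer that scanning is about to begin.
        for (ScanConsumer consumer : consumers) {
            consumer.beginScan();
        }

        // Files.walk is used here instead of DirectoryWalker (assumption for the sketch).
        List<Path> files;
        try (Stream<Path> walk = Files.walk(repositoryRoot)) {
            files = walk.filter(Files::isRegularFile).sorted().collect(Collectors.toList());
        }

        for (Path file : files) {
            stats.filesDiscovered++;
            Path relative = repositoryRoot.relativize(file);
            for (ScanConsumer consumer : consumers) {
                consumer.processFile(relative);
            }
        }

        for (ScanConsumer consumer : consumers) {
            consumer.completeScan();
        }
        stats.durationMillis = System.currentTimeMillis() - start;
        return stats;
    }

    public static void main(String[] args) throws IOException {
        // Build a tiny throwaway repository so the sketch can run anywhere.
        Path root = Files.createTempDirectory("repo-sketch");
        Files.createDirectories(root.resolve("org/example/1.0"));
        Files.write(root.resolve("org/example/1.0/example-1.0.jar"), new byte[0]);

        ScanConsumer printer = new ScanConsumer() {
            public void beginScan() { System.out.println("scan started"); }
            public void processFile(Path relativePath) { System.out.println("found " + relativePath); }
            public void completeScan() { System.out.println("scan finished"); }
        };

        ScanStats stats = scan(root, Collections.singletonList(printer));
        System.out.println("files discovered: " + stats.filesDiscovered);
    }
}
```

The point of the sketch is only the ordering: the begin signal goes out once, every discovered file is offered to every consumer, and the statistics are filled in at the end.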

Repository Content Consumers (KnownRepositoryContentConsumer)

These consumers are configured in archiva.xml, under <repositoryScanning>.

| Class | Role Hint | What does it do? |
| --- | --- | --- |
| ValidateChecksumConsumer | validate-checksum | Validates checksum files. |
| LegacyConverterArtifactConsumer | artifact-legacy-to-default-converter | Converts legacy artifacts to m2 artifacts. |
| ArtifactMissingChecksumConsumer | create-missing-checksums | Creates a checksum if it is missing. |
| AutoRemoveConsumer | auto-remove | Removes files in the repository being scanned if the file type matches any of the configured file types to be removed. |
| AutoRenameConsumer | auto-rename | |
| ArtifactUpdateDatabaseConsumer | update-db-artifact | Saves the artifact (in the form of ArchivaArtifact) to the database. |
| IndexContentConsumer | index-content | Processes the artifact's content into a FileContentRecord that is used for indexing. |
| RepositoryPurgeConsumer | repository-purge | Removes old snapshots from the repository either by the number of days old or by the retention count. (See Repository Purge section below) |

The Process

...

and

...

Indexing

...

  • User types the query string and hits the Search button.
  • Archiva then searches its indices for the query string and returns the search results.
  • The user can click on an artifact to browse it. What the user actually browses is the pom. At the back-end, Archiva checks whether the project model is already in the database. If it is not, Archiva constructs the ArchivaProjectModel object and saves it to the database [1]. Once the model is in the database, the pom info or artifact is displayed.
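The "check the database first, build the model only if it is missing" step in the last bullet can be sketched as below. This is only an illustration under loud assumptions: ProjectModelSketch and ProjectModelLookupSketch are hypothetical names, a HashMap stands in for the database, and the "pom reader" is a plain function that splits a groupId:artifactId:version key instead of parsing a real pom.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

/** Hypothetical, simplified project model; not Archiva's ArchivaProjectModel. */
final class ProjectModelSketch {
    final String groupId;
    final String artifactId;
    final String version;

    ProjectModelSketch(String groupId, String artifactId, String version) {
        this.groupId = groupId;
        this.artifactId = artifactId;
        this.version = version;
    }
}

final class ProjectModelLookupSketch {
    // In-memory stand-in for the database (assumption for the sketch).
    private final Map<String, ProjectModelSketch> database = new HashMap<>();
    // Stand-in for "read the pom and construct the model".
    private final Function<String, ProjectModelSketch> pomReader;
    // Counts how often the model had to be constructed from the pom.
    int pomReads = 0;

    ProjectModelLookupSketch(Function<String, ProjectModelSketch> pomReader) {
        this.pomReader = pomReader;
    }

    /** Returns the stored model, constructing and saving it first if it is missing. */
    ProjectModelSketch browse(String key) {
        ProjectModelSketch model = database.get(key);
        if (model == null) {
            model = pomReader.apply(key);
            pomReads++;
            database.put(key, model);
        }
        return model;
    }

    public static void main(String[] args) {
        ProjectModelLookupSketch lookup = new ProjectModelLookupSketch(key -> {
            String[] parts = key.split(":");
            return new ProjectModelSketch(parts[0], parts[1], parts[2]);
        });
        lookup.browse("org.example:artifactX:1.3");
        lookup.browse("org.example:artifactX:1.3");
        System.out.println("pom reads: " + lookup.pomReads);
    }
}
```

Browsing the same artifact a second time is served from the stored model, so the pom is only read once.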

...

Finding an Artifact

...

...

Registry Listeners

A RegistryListener (plexus-registry) is an interface that receives a notification for every change in the Registry. There are a handful of classes in Archiva that implement it and perform some processing every time the configuration changes.
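The listener idea itself is the plain observer pattern. The sketch below shows it with hypothetical types (ConfigChangeListener and ConfigRegistrySketch); it does not use or reproduce the plexus-registry API, and the property name is made up for the example.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical listener contract; not the plexus-registry RegistryListener interface. */
interface ConfigChangeListener {
    void afterConfigurationChange(String propertyName, Object propertyValue);
}

/** Hypothetical registry that notifies its listeners on every change. */
final class ConfigRegistrySketch {
    private final Map<String, Object> values = new HashMap<>();
    private final List<ConfigChangeListener> listeners = new ArrayList<>();

    void addChangeListener(ConfigChangeListener listener) {
        listeners.add(listener);
    }

    /** Stores the value, then tells every registered listener about the change. */
    void setValue(String propertyName, Object propertyValue) {
        values.put(propertyName, propertyValue);
        for (ConfigChangeListener listener : listeners) {
            listener.afterConfigurationChange(propertyName, propertyValue);
        }
    }

    Object getValue(String propertyName) {
        return values.get(propertyName);
    }

    public static void main(String[] args) {
        ConfigRegistrySketch registry = new ConfigRegistrySketch();
        // A component that reacts whenever the configuration changes.
        registry.addChangeListener((name, value) ->
                System.out.println("configuration changed: " + name + " = " + value));
        registry.setValue("example.daysOlder", 30); // made-up property name
    }
}
```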

...

Removes old snapshots from the managed repository based on one of two criteria: By Number of Days Old or By Retention Count. There is also an option to enable or disable the cleanup of released snapshots from the repository.

Classes

Below are the classes for Repository Purge:

...

  1. To enable repository purge, add "repository-purge" to the <knownContentConsumers> section of archiva.xml. The RepositoryPurgeConsumer will then be executed whenever repository scanning is started.
  2. The user can choose whether to purge snapshots older than a specific number of days OR to purge snapshots while retaining a specific number of them. This is configured through the "Repository Purge By Days Older Than" and "Repository Purge By Retention Count" fields in the Add/Edit Repository page. By default, these have the values "100" and "2" respectively. If "Repository Purge By Days Older Than" is NOT EQUAL TO 0 (zero), then it is the criteria used for the repository purge. Otherwise, if it is EQUAL TO 0 (zero), the "Repository Purge By Retention Count" criteria is used instead.
  3. To enable or disable the cleanup of released snapshots in the repository, the user can check or uncheck the "Delete Released Snapshots" option in the Add/Edit Repository page.
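The rule in item 2 for picking the purge criteria is small enough to write down. A minimal sketch, assuming hypothetical names (PurgeSettings, PurgeCriteria and PurgeCriteriaSketch are not Archiva classes); only the defaults "100" and "2" and the zero check come from the description above.

```java
/** Hypothetical stand-in for the purge-related fields of a repository's configuration. */
final class PurgeSettings {
    final int daysOlder;
    final int retentionCount;
    final boolean deleteReleasedSnapshots;

    PurgeSettings(int daysOlder, int retentionCount, boolean deleteReleasedSnapshots) {
        this.daysOlder = daysOlder;
        this.retentionCount = retentionCount;
        this.deleteReleasedSnapshots = deleteReleasedSnapshots;
    }
}

/** The two purge criteria described above. */
enum PurgeCriteria { DAYS_OLD, RETENTION_COUNT }

final class PurgeCriteriaSketch {
    /** A non-zero "days older" value wins; zero falls back to the retention count. */
    static PurgeCriteria select(PurgeSettings settings) {
        return settings.daysOlder != 0 ? PurgeCriteria.DAYS_OLD : PurgeCriteria.RETENTION_COUNT;
    }

    public static void main(String[] args) {
        // The defaults mentioned above: 100 days, retention count of 2.
        System.out.println(select(new PurgeSettings(100, 2, false)));
        // Setting "days older" to zero switches to the retention count.
        System.out.println(select(new PurgeSettings(0, 2, false)));
    }
}
```

Note that with the default values the retention count is never consulted, because the days-older value is non-zero.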

The Process

  1. RepositoryPurgeConsumer is executed during repository scanning. Only "artifact" file types are consumed (the <fileType> with the "artifact" id in archiva.xml).
  2. The consumer checks whether the deleteReleasedSnapshots field (in RepositoryConfiguration) is enabled. If so, it executes CleanupReleasedSnapshotsRepositoryPurge.
    • CleanupReleasedSnapshotsRepositoryPurge removes all released snapshots from the repository. For example: 1.2, 1.3-SNAPSHOT and 1.3 exist for artifactX in the repo. 1.3-SNAPSHOT will be removed since 1.3 already exists (and has therefore already been released). All metadata files are updated based on the remaining versions of the artifact in the repository.
  3. The consumer also checks the value of the daysOlder field in the configuration of the repository being scanned. If it is not set to 0 (zero), the consumer executes DaysOldRepositoryPurge. Otherwise, it executes RetentionCountRepositoryPurge.
    • DaysOldRepositoryPurge checks when the discovered SNAPSHOT artifact was last modified. If it is more than X (the daysOlder value) days old, the artifact is removed from the repository.
    • RetentionCountRepositoryPurge, on the other hand, checks whether the number of "unique versioned" snapshot artifacts in the directory where the discovered artifact resides is GREATER THAN the retentionCount value. If so, the oldest snapshot artifacts (including associated poms, source jars, javadoc jars, etc.) are removed until the total number of unique versioned artifacts is EQUAL TO the retention count. For example, the discovered artifact is ../artifactX/2.0-SNAPSHOT/artifactX-2.0-SNAPSHOT.jar. RetentionCountRepositoryPurge will get a list of the files in the ../artifactX/2.0-SNAPSHOT directory. Let's say ../artifactX/2.0-SNAPSHOT has the following contents: artifactX-2.0-1111111-1.jar, artifactX-2.0-1111111-1.pom, artifactX-2.0-1111100-2.jar, artifactX-2.0-1111100-2.pom, artifactX-2.0-SNAPSHOT.jar and artifactX-2.0-SNAPSHOT.pom. If the retention count is 2, then artifactX-2.0-1111111-1.jar and artifactX-2.0-1111111-1.pom are removed from the repo and the 2 newest artifacts (and their associated files, in this case the poms) are retained.
  4. For all these RepositoryPurge implementations, all artifacts removed from the repository are also removed from the database [1].
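The three purge rules above can be sketched as small pure functions. This is a rough illustration, not the actual RepositoryPurge implementations: PurgeRulesSketch and its method names are hypothetical, nothing is deleted from disk or from the database, and ordering snapshots by last-modified time in the retention rule is an assumption (the text above does not say how "oldest" is determined).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.concurrent.TimeUnit;

final class PurgeRulesSketch {
    static final String SNAPSHOT_SUFFIX = "-SNAPSHOT";

    /** Released-snapshot cleanup: a snapshot is removable when its release version is also present. */
    static Set<String> releasedSnapshots(Set<String> versions) {
        Set<String> removable = new LinkedHashSet<>();
        for (String version : versions) {
            if (version.endsWith(SNAPSHOT_SUFFIX)) {
                String release = version.substring(0, version.length() - SNAPSHOT_SUFFIX.length());
                if (versions.contains(release)) {
                    removable.add(version);
                }
            }
        }
        return removable;
    }

    /** Days-old rule: true when the file was last modified more than daysOlder days before now. */
    static boolean isOlderThan(long lastModifiedMillis, long nowMillis, int daysOlder) {
        return nowMillis - lastModifiedMillis > TimeUnit.DAYS.toMillis(daysOlder);
    }

    /**
     * Retention-count rule: returns the unique versions to remove so that exactly
     * retentionCount remain. "Oldest" is decided by last-modified time here,
     * which is an assumption made for this sketch.
     */
    static List<String> beyondRetention(Map<String, Long> lastModifiedByVersion, int retentionCount) {
        List<Map.Entry<String, Long>> entries = new ArrayList<>(lastModifiedByVersion.entrySet());
        entries.sort(Map.Entry.comparingByValue()); // oldest first
        int excess = Math.max(0, entries.size() - retentionCount);
        List<String> toRemove = new ArrayList<>();
        for (int i = 0; i < excess; i++) {
            toRemove.add(entries.get(i).getKey());
        }
        return toRemove;
    }

    public static void main(String[] args) {
        // The artifactX example from the text: 1.2, 1.3-SNAPSHOT and 1.3 are present.
        Set<String> versions = new LinkedHashSet<>(Arrays.asList("1.2", "1.3-SNAPSHOT", "1.3"));
        System.out.println("released snapshots: " + releasedSnapshots(versions));

        long now = System.currentTimeMillis();
        long lastModified = now - TimeUnit.DAYS.toMillis(120);
        System.out.println("older than 100 days: " + isOlderThan(lastModified, now, 100));
    }
}
```

Keeping the rules free of file and database access makes them easy to check in isolation; the real consumer would additionally remove the associated files, update the metadata, and clean up the database entries as described in the steps above.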

...