...

Document up-stream data model

For each entity, event, and relationship on the short-list, find links to documentation on exactly what the data looks like in JSON or XML form.

Identify Important Relationships

...

Find the best Activity Streams Actor or Object type for each upstream Entity type 

...

Make a short list of providers to write in the initial implementation. For each one, describe the use case it supports, the inputs it requires to start, and the type(s) of documents it provides.

  1. OrganizationProvider
    1. Use Case: Given a finite list of organization IDs, pull latest details about each. 
      1. Input: List of organization ids
      2. Output: github:organization
    2. Use Case: Given a finite list of user IDs, pull latest details on each organization any of them belong to.
      1. Input: List of user ids
      2. Output: github:organization
  2. RepositoryProvider
    1. Use Case: Given a finite list of organization IDs, pull latest details on each public repository belonging to any of them.
      1. Input: List of organization ids
      2. Output: github:repository
    2. Use Case: Given a finite list of user IDs, pull latest details on each public repository belonging to any of them.
      1. Input: List of user ids
      2. Output: github:repository
  3. UserProvider
    1. Use Case: Given a finite list of user IDs, pull latest details on all of them.
      1. Input: List of user ids
      2. Output: github:user
  4. UserFollowingProvider
    1. Use Case: Given a finite list of user IDs, pull latest details on each user any of them are following, maintaining the follow connection.
      1. Input: List of user ids
      2. Output: github:follow (github:user, github:user)
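As a rough illustration of the input/output shapes listed above, here is a minimal Java sketch of the UserFollowingProvider use case. The IdListProvider interface, the FollowDocument class, and the followingLookup function are hypothetical names made up for this example and are not part of any library; the lookup stands in for whatever call ends up fetching the upstream data.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

/** Hypothetical document for one follow connection: github:follow (github:user, github:user). */
final class FollowDocument {
    final String followerId;
    final String followedId;

    FollowDocument(String followerId, String followedId) {
        this.followerId = followerId;
        this.followedId = followedId;
    }
}

/** Hypothetical provider contract: a finite list of IDs in, a list of documents out. */
interface IdListProvider<T> {
    List<T> provide(List<String> ids);
}

/** Sketch only: emits one FollowDocument per (user, followed user) pair. */
final class UserFollowingProviderSketch implements IdListProvider<FollowDocument> {
    // Assumption: maps a user ID to the IDs of the users that user follows.
    private final Function<String, List<String>> followingLookup;

    UserFollowingProviderSketch(Function<String, List<String>> followingLookup) {
        this.followingLookup = followingLookup;
    }

    @Override
    public List<FollowDocument> provide(List<String> userIds) {
        List<FollowDocument> documents = new ArrayList<>();
        for (String userId : userIds) {
            for (String followedId : followingLookup.apply(userId)) {
                // Keep both ends of the connection so the follow relationship survives.
                documents.add(new FollowDocument(userId, followedId));
            }
        }
        return documents;
    }
}
```

The other providers on the list could follow the same contract, changing only the document type they emit.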

Look for a high-quality Java library

Search GitHub, Stack Overflow, and Google to see if you can find a high-quality Java library that simplifies the code involved in getting the data.

It should be:

  1. Publicly available source code
  2. FOSS-friendly license (Apache 2.0, MIT, etc.)
  3. In Maven Central (findable in a public Maven repository)
  4. Active

First, look on the web site of the data source

https://developer.github.com/libraries/

No official Java library

Look at list of third party options

#1) GitHub Java API (org.eclipse.egit.github.core)

Publicly available source code - https://github.com/eclipse/egit-github/tree/master/org.eclipse.egit.github.core

 

Eclipse Public License v1 (acceptable) - https://github.com/eclipse/egit-github/blob/master/org.eclipse.egit.github.core/about.html

https://www.apache.org/legal/resolved says EPL-licensed works can be used, but only in binary form. That's fine; we aren't going to redistribute the source.

In a public Maven repository - http://search.maven.org/#search%7Cga%7C1%7Cegit-github

Active - The last release was in 2013, which is not ideal, but people are still committing to the project and working on issues.

#2) GitHub API for Java

Publicly available source code - http://github-api.kohsuke.org/

MIT License (acceptable) - http://github-api.kohsuke.org/license.html

In a public maven repository - https://oss.sonatype.org/#nexus-search;quick~github-api

Active - There have been 80 releases, which is a great sign.

#3) http://github.jcabi.com/

Publicly available source code - https://github.com/jcabi/jcabi-github

License - not an open license.  Deal-breaker.

#2 looks like the best bet.

Document how each provider class will acquire data.

Make notes on how you plan to use the Java library to get the source data on your shortlist into each provider.

  1. OrganizationProvider
  2. RepositoryProvider
  3. UserProvider
  4. UserFollowingProvider
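One way to take these notes is to write each provider against a thin client interface first and fill in the library calls afterwards. The sketch below does this for the two OrganizationProvider use cases. UpstreamClient and both of its methods are hypothetical names invented here, and documents are shown as plain strings for brevity. With library #2, the two methods would presumably be backed by calls such as GitHub#getOrganization and GHUser#getOrganizations, but treat that mapping as an assumption to verify against the library's documentation.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical seam over whichever Java library is chosen; not a real library type. */
interface UpstreamClient {
    /** IDs of the organizations the given user belongs to. */
    List<String> organizationIdsForUser(String userId);

    /** Latest details for one organization, as a serialized document. */
    String organizationDetails(String organizationId);
}

/** Sketch only: covers both OrganizationProvider use cases. */
final class OrganizationProviderSketch {
    private final UpstreamClient client;

    OrganizationProviderSketch(UpstreamClient client) {
        this.client = client;
    }

    /** Use case 1: organization IDs in, one document per distinct organization out. */
    List<String> fromOrganizationIds(List<String> organizationIds) {
        List<String> documents = new ArrayList<>();
        // LinkedHashSet drops repeated IDs while keeping first-seen order.
        for (String organizationId : new LinkedHashSet<>(organizationIds)) {
            documents.add(client.organizationDetails(organizationId));
        }
        return documents;
    }

    /** Use case 2: user IDs in, one document per organization any of them belongs to. */
    List<String> fromUserIds(List<String> userIds) {
        Set<String> organizationIds = new LinkedHashSet<>();
        for (String userId : userIds) {
            organizationIds.addAll(client.organizationIdsForUser(userId));
        }
        return fromOrganizationIds(new ArrayList<>(organizationIds));
    }
}
```

Collecting organization IDs into a set before fetching means an organization shared by several users is requested only once.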

Figure out permissions

If the data source requires special permissions to get at the dataset you are looking at, figure out how to get those permissions and document the process.
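For GitHub, unauthenticated requests are rate-limited more tightly than authenticated ones, so it is worth documenting how a token reaches the provider. The sketch below builds (but does not send) a request that carries a personal access token in the Authorization header, using only the JDK's java.net.http classes (Java 11+). The class name GithubAuthSketch is hypothetical, and how the token is stored and supplied is left as an assumption for the documented process to settle.

```java
import java.net.URI;
import java.net.http.HttpRequest;

/** Sketch only: shows where a GitHub token would be attached to a request. */
final class GithubAuthSketch {

    /** Builds a GET request with the token in the Authorization header; nothing is sent. */
    static HttpRequest authenticatedRequest(String url, String token) {
        if (token == null || token.isBlank()) {
            // Fail early so a missing credential is not mistaken for an API problem.
            throw new IllegalArgumentException("A non-empty access token is required");
        }
        return HttpRequest.newBuilder(URI.create(url))
                .header("Authorization", "token " + token)
                .GET()
                .build();
    }
}
```

A caller might read the token from an environment variable or a configuration file and pass it in, for example `GithubAuthSketch.authenticatedRequest("https://api.github.com/user", token)`.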

...