When making code changes related to the metadata parsing, it is useful to
see how the internal format has changed by seeing the differences in the
dump files. Those files are currently in the binary .pickle format. This
just straight converts them to YAML, which is a text format, so that normal
diff tools work to see changes.
The dump files are named .yaml instead of .yml since .yml is used for hand-
edited YAML files for fdroiddata/metadata, while these dump files here are
a human readable form of a Python pickle.
JSON and YAML are very closely related, so supporting both of them is
basically almost no extra work. Both are also closely related to how
Python works with dicts and pickles. XML is a very different beast, and its
not popular for this kind of thing anyway, so just purge it.
Though the YAML people recommend .yaml for the file extension, in Android
land it seems clear that .yml has won out:
* .travis.yml
* .gitlab-ci.yml
* .circle.yml
* Ansible main.yml
This simplifies usage, goes from
build['flag']
to
build.flag
Also makes static analyzers able to detect invalid attributes as the set
is now limited in the class definition.
As a bonus, setting of the default field values is now done in the
constructor, not separately and manually.
While at it, unify "build", "thisbuild", "info", "thisinfo", etc into
just "build".
This simplifies usage, goes from
app['Foo']
to
app.Foo
Also makes static analyzers able to detect invalid attributes as the set
is now limited in the class definition.
As a bonus, setting of the default field values is now done in the
constructor, not separately and manually.
For a bit repo like f-droid.org, it makes sense to standardize on a single
format for metadata files. This adds support for enforcing a single data
format, or a reduced set of data formats. So f-droid.org would run like
this if it changed to YAML:
accepted_formats = ['txt', 'yaml']
Then once everything was converted to YAML, it could look like this:
accepted_formats = ['yaml']
YAML is a format that is quite similar to the .txt format, but is a
widespread standard that has editing modes in popular editors. It is also
easily parsable in python.
The .pickle for testing is a lightly edited version of the real metadata
for org.videolan.vlc:
* comments were removed
While the current text metadata format is good for human readability and
editability, it is difficult to produce and parse using code. XML is a
widespread standard format for easy automatic parsing and creating, while
having decent human readability.
The .pickle for testing is a lightly edited version of the real metadata
for net.osmand.plus:
* comments were removed
* "NonFreeNet" was added as an AntiFeature
This is a test to cover future modifications of the .txt metadata parsing.
The pickle file was generated by just dumping the current parsed metadata,
so this test will always succeed if the parsing is not changed.