Ship machine readable API information

Posted: Tue Dec 01, 2020 11:16 pm
by Mooeing-747
Apologies if this isn't the right place to request this; I wasn't sure where it fits best.

I've noticed several tools (like autocompletes for various IDEs) that rely on scraping the Lua API documentation and the prototype pages, and then do all sorts of HTML scraping magic to extract things like structures, methods, types, descriptions, etc. As with any scraping solution, this is brittle and imperfect. For example, I've noticed in at least one autocomplete plugin that some of the descriptions are mangled or just not present.

I'm working on a project that requires the same sort of information about the API, so I'm about to write my own parser for this information :oops:

I'm assuming that the documentation is generated from the raw type information extracted from the codebase, as opposed to the html files being edited by hand. If that's the case, would it be possible to bundle/post the raw information as well as the human readable api docs? Then programs like autocompletes could parse the definition files directly and skip all of the web scraping.

I'm not asking for any specific format, just something amenable to being read by code.

Re: Ship machine readable API information

Posted: Thu Dec 03, 2020 11:35 am
by Bilka
Could you link one of the tools that scrapes the prototype pages? I'm curious what their result is.

For lua-api.factorio.com, there is https://github.com/spiwn/FactorioApiScraper which is open source, so you don't need to make your own parser.

The prototype pages are edited entirely by hand.

Re: Ship machine readable API information

Posted: Thu Dec 03, 2020 11:53 am
by eradicator
(Small sidenote: It would be quite awesome if the prototype pages were shipped (i.e. in the stand-alone zip file) at all (in human readable form) so that they're accessible offline. With the added bonus that one could directly compare the prototype structure of two game versions - making migration of old mods or supporting old versions easier when things change.)
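
To illustrate the version-comparison idea, here is a minimal sketch that assumes each game version's prototype documentation were available as a machine-readable snapshot. The snapshot layout (prototype name mapped to property names and types) and all sample entries are hypothetical; no such official format is described in this thread.

```python
def diff_prototypes(old, new):
    """Compare two {prototype: {property: type}} snapshots (hypothetical layout)."""
    changes = {}
    for proto in sorted(set(old) | set(new)):
        old_props = old.get(proto, {})
        new_props = new.get(proto, {})
        added = sorted(set(new_props) - set(old_props))
        removed = sorted(set(old_props) - set(new_props))
        # Properties present in both versions whose documented type changed.
        retyped = sorted(
            p for p in set(old_props) & set(new_props)
            if old_props[p] != new_props[p]
        )
        if added or removed or retyped:
            changes[proto] = {"added": added, "removed": removed, "retyped": retyped}
    return changes


# Made-up sample data for illustration only; not real Factorio prototypes.
SNAPSHOT_OLD = {
    "example-entity": {"name": "string", "speed": "double", "legacy_flag": "bool"},
}
SNAPSHOT_NEW = {
    "example-entity": {"name": "string", "speed": "uint32", "icon": "FileName"},
    "example-item": {"name": "string"},
}

if __name__ == "__main__":
    for proto, change in diff_prototypes(SNAPSHOT_OLD, SNAPSHOT_NEW).items():
        print(proto, change)
```

A mod author could run something like this against two snapshots to get a checklist of properties to review when migrating.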

Re: Ship machine readable API information

Posted: Thu Dec 03, 2020 12:00 pm
by Bilka
eradicator wrote:
Thu Dec 03, 2020 11:53 am
(Small sidenote: It would be quite awesome if the prototype pages were shipped (i.e. in the stand-alone zip file) at all (in human readable form) so that they're accessible offline. With the added bonus that one could directly compare the prototype structure of two game versions - making migration of old mods or supporting old versions easier when things change.)
We all have dreams :) This would require that the doc is already updated to the newest version when that version is released. That is rather far from reality; the doc updates take me somewhere from a few weeks to a few months per major version.

Re: Ship machine readable API information

Posted: Thu Dec 03, 2020 12:19 pm
by eradicator
Bilka wrote:
Thu Dec 03, 2020 12:00 pm
eradicator wrote:
Thu Dec 03, 2020 11:53 am
(Small sidenote: It would be quite awesome if the prototype pages were shipped (i.e. in the stand-alone zip file) at all (in human readable form) so that they're accessible offline. With the added bonus that one could directly compare the prototype structure of two game versions - making migration of old mods or supporting old versions easier when things change.)
We all have dreams :) This would require that the doc is already updated to the newest version when that version is released. That is rather far from reality; the doc updates take me somewhere from a few weeks to a few months per major version.
Hm. I wasn't aware that it took that long. But my dream is just having any sort of official offline prototype reference (yes, my internet sucks and if the api doc wasn't available offline i'd never have started modding); even if it takes a month it's still useful for many months after that! Currently i have to use httrack to try and scrape the wiki, which is a lengthy and unsatisfying process when all i want is the prototype pages (not all the other wiki stuff), so even if i had to wait it'd still be a major improvement for me if there was a simple current-snapshot zip download. And presumably if such a standardized snapshot format existed, the development of auto-parsers for it would also be easier.

Re: Ship machine readable API information

Posted: Thu Dec 03, 2020 3:56 pm
by Squelch
I find myself in the same boat on occasion, and would love to have an offline version too, even if it took some time to complete.

For an automated solution, there's always the VS Code debug plugin from justarandomgeek, which also includes IntelliSense autocomplete and comments from the online documentation. (At least I think it's that plugin, though it's not mentioned specifically.)

VS Code is cross-platform btw.

Re: Ship machine readable API information

Posted: Thu Dec 03, 2020 5:06 pm
by ptx0
eradicator wrote:
Thu Dec 03, 2020 12:19 pm
Bilka wrote:
Thu Dec 03, 2020 12:00 pm
eradicator wrote:
Thu Dec 03, 2020 11:53 am
(Small sidenote: It would be quite awesome if the prototype pages were shipped (i.e. in the stand-alone zip file) at all (in human readable form) so that they're accessible offline. With the added bonus that one could directly compare the prototype structure of two game versions - making migration of old mods or supporting old versions easier when things change.)
We all have dreams :) This would require that the doc is already updated to the newest version when that version is released. That is rather far from reality; the doc updates take me somewhere from a few weeks to a few months per major version.
Hm. I wasn't aware that it took that long. But my dream is just having any sort of official offline prototype reference (yes, my internet sucks and if the api doc wasn't available offline i'd never have started modding); even if it takes a month it's still useful for many months after that! Currently i have to use httrack to try and scrape the wiki, which is a lengthy and unsatisfying process when all i want is the prototype pages (not all the other wiki stuff), so even if i had to wait it'd still be a major improvement for me if there was a simple current-snapshot zip download. And presumably if such a standardized snapshot format existed, the development of auto-parsers for it would also be easier.
I might be missing something, but isn't there a GitHub repository from Wube that contains the prototype changes between versions? And you can download and 'diff' this repository's commits?

maybe they can put the wiki on github as well, and so it can be cloned?

Re: Ship machine readable API information

Posted: Thu Dec 03, 2020 5:30 pm
by Squelch
ptx0 wrote:
Thu Dec 03, 2020 5:06 pm
I might be missing something, but isn't there a GitHub repository from Wube that contains the prototype changes between versions? And you can download and 'diff' this repository's commits?

maybe they can put the wiki on github as well, and so it can be cloned?
You make some good points.

Re: Ship machine readable API information

Posted: Thu Dec 03, 2020 7:10 pm
by Mooeing-747
Bilka wrote:
Thu Dec 03, 2020 11:35 am
Could you link one of the tools that scrapes the prototype pages? I'm curious what their result is.

For lua-api.factorio.com, there is https://github.com/spiwn/FactorioApiScraper which is open source, so you don't need to make your own parser.

The prototype pages are edited entirely by hand.
Examples are autocompletes for:

- atom: https://github.com/Yokmp/atom-autocomplete-factorio
- intellij: https://github.com/knoxfighter/intellij ... completion
- VS code: https://github.com/simonvizzini/vscode- ... tocomplete


I'm aware of the various parsers (each project I linked above has some form of a custom parser), and this is somewhat my point. There's demand for this information, and at least 4 or 5 different projects to get it, with varying levels of completeness/accuracy.

For my project, I've found the parsers don't always handle the edge cases well or include all the information I want. For example, the parser you linked doesn't handle optional arguments at all, and doesn't handle table arguments well: it says "this is a table" but the table members are only in the doc string.
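
As a sketch of what gets lost when table members live only in the doc string: a structured description could record optionality and nested table fields explicitly. The `Parameter` class and the `options` example below are hypothetical; they are not taken from the API or from any of the linked parsers.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Parameter:
    """One argument or table field (hypothetical schema)."""
    name: str
    type: str
    optional: bool = False
    # Member fields, populated only when the parameter is a table.
    fields: List["Parameter"] = field(default_factory=list)


def signature(param: Parameter) -> str:
    """Render a parameter, marking optional ones with '?' and expanding table fields."""
    mark = "?" if param.optional else ""
    if param.fields:
        inner = ", ".join(signature(f) for f in param.fields)
        return f"{param.name}{mark}: {{{inner}}}"
    return f"{param.name}{mark}: {param.type}"


# Hypothetical method argument: a table with one required and one optional member.
options = Parameter(
    name="options",
    type="table",
    fields=[
        Parameter("position", "Position"),
        Parameter("force", "string", optional=True),
    ],
)

if __name__ == "__main__":
    print(signature(options))
```

With the members held as data rather than prose, an autocomplete can offer `position` and `force` directly instead of only reporting "table".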

A slightly different example: the values of the `defines` constants aren't available online anywhere, so some parsers naively assign them in order to an enum. But some defines are strings, some are integers that don't increase monotonically, etc., and this is the type of information I'd like. (I've taken to dumping them to the log with serpent to work around this.)
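
A small sketch of why the in-order guess goes wrong. The member names and values below are invented for illustration; they are not real defines.

```python
# Hypothetical define members with their actual values, as they might appear
# in a dump. Names and values are made up for this example.
ACTUAL_VALUES = {
    "first": 0,
    "second": 5,       # integers need not be consecutive
    "third": 3,        # nor increasing
    "fourth": "four",  # and some values are strings
}


def naive_enum(names):
    """What a parser does when it only sees member names: number them in order."""
    return {name: index for index, name in enumerate(names)}


def mismatches(actual):
    """Return {member: (guessed, actual)} for every member the naive guess gets wrong."""
    guessed = naive_enum(actual)
    return {n: (guessed[n], actual[n]) for n in actual if guessed[n] != actual[n]}


if __name__ == "__main__":
    for name, (guess, real) in mismatches(ACTUAL_VALUES).items():
        print(f"{name}: guessed {guess!r}, actual {real!r}")
```

Only the first member happens to match here; everything else needs the real value from the game.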

That's a shame about prototype pages. I feel for you :(

Re: Ship machine readable API information

Posted: Thu Dec 03, 2020 7:51 pm
by Bilka
https://github.com/knoxfighter/intellij ... completion is the one with the prototype stuff and they are maintaining it by hand. That's the second saddest thing I've read today, right after
Bilka wrote:
Thu Dec 03, 2020 11:35 am
The prototype pages are edited entirely by hand.
Poor sod.

Re: Ship machine readable API information

Posted: Thu Dec 03, 2020 7:58 pm
by Bilka
Squelch wrote:
Thu Dec 03, 2020 5:30 pm
ptx0 wrote:
Thu Dec 03, 2020 5:06 pm
I might be missing something, but isn't there a GitHub repository from Wube that contains the prototype changes between versions? And you can download and 'diff' this repository's commits?

maybe they can put the wiki on github as well, and so it can be cloned?
You make some good points.
Wiki pages are just stored in the wiki database, so to get them into git they'd have to be exported somehow. That's not an idea I'd immediately say "no" to, but it would need significant work and most likely wouldn't be machine readable either (since the entire prototype doc isn't, no matter what format you shove it into).

Re: Ship machine readable API information

Posted: Thu Dec 03, 2020 8:00 pm
by Squelch
Bilka wrote:
Thu Dec 03, 2020 7:51 pm
https://github.com/knoxfighter/intellij ... completion is the one with the prototype stuff and they are maintaining it by hand. That's the second saddest thing I've read today, right after
Ah ha! yes that's the one. I hadn't realised that was hand crafted too.
Bilka wrote:
Thu Dec 03, 2020 11:35 am
The prototype pages are edited entirely by hand.
Poor sod.
Agreed on all counts.

There has to be a better way.

Re: Ship machine readable API information

Posted: Thu Dec 03, 2020 8:10 pm
by Mooeing-747
As another data-point, https://github.com/sguest/factorio-types has a prototype parser that I've been cribbing.

Re: Ship machine readable API information

Posted: Mon Nov 21, 2022 5:38 pm
by Kuxynator
No one has mentioned here yet that there has been a machine-readable format since Factorio 1.1.35 (16.06.2021):
https://lua-api.factorio.com/latest/json-docs.html

However, I still miss an official machine-readable version of the prototype definitions.
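
For comparison with the scraping approaches discussed earlier in the thread, here is a minimal sketch of consuming a JSON description directly. It parses a tiny inline sample rather than the real file, and the field names (`classes`, `methods`, `parameters`, `optional`) are assumptions for illustration; the page linked above documents the actual schema.

```python
import json

# Tiny inline sample standing in for a downloaded file. The field names are
# assumptions for illustration, and ExampleClass/do_thing are made up.
SAMPLE = """
{
  "classes": [
    {
      "name": "ExampleClass",
      "methods": [
        {
          "name": "do_thing",
          "parameters": [
            {"name": "target", "type": "string", "optional": false},
            {"name": "count", "type": "uint", "optional": true}
          ]
        }
      ]
    }
  ]
}
"""


def completions(doc):
    """Build 'Class.method(args)' strings, marking optional parameters with '?'."""
    entries = []
    for cls in doc.get("classes", []):
        for method in cls.get("methods", []):
            args = ", ".join(
                p["name"] + ("?" if p.get("optional") else "")
                for p in method.get("parameters", [])
            )
            entries.append(f"{cls['name']}.{method['name']}({args})")
    return entries


if __name__ == "__main__":
    print(completions(json.loads(SAMPLE)))
```

No HTML parsing is involved, so a change to the page layout would not affect a consumer like this.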

Re: Ship machine readable API information

Posted: Mon Nov 21, 2022 5:51 pm
by Nidan
Kuxynator wrote:
Mon Nov 21, 2022 5:38 pm
However, I still miss an official machine-readable version of the prototype definitions.
According to the last paragraph of FFF368 it's being worked on; no ETA however.

Re: Ship machine readable API information

Posted: Wed Aug 16, 2023 10:28 pm
by asdff45
I'm just bringing this up because we finally have a JSON file for prototypes as well: https://lua-api.factorio.com/latest/ind ... otype.html

Since i am the developer of https://github.com/knoxfighter/factorio-api-parser (the underlying tech for the autocompletion plugin), i can tell you: I have archived the repo, but i will still host it for a while :)