Automated testing against new game versions?
I am a mod author who does not play Factorio every month. I play for a few months, then quit for a few months to a year, then come back to see new features in the game. Unfortunately this means my mods are dead/useless after a new game version comes out, until I return and update them.
I would find it very useful if there was an automated system that I could run, or that the devs would run for us, that would bump the factorio version in one of my mods and attempt to load it into each new version of the game. If an error is encountered, send me an email. Basically I'm describing continuous integration testing. Has anyone done this with Factorio mods? Would the devs be willing to do this across the whole mod portal (or maybe just let mod authors opt in?) just before releasing a new version?
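A minimal sketch of the "bump the factorio version" step, assuming the mod's metadata is an `info.json` file whose `factorio_version` field holds the first two parts of the game version. The function names and the sample mod are hypothetical, and the load attempt and the email are left out:

```python
import json


def to_info_json_version(game_version: str) -> str:
    """Reduce a full release such as '0.17.4' to the two-part form '0.17'."""
    return ".".join(game_version.split(".")[:2])


def bump_factorio_version(info_json_text: str, game_version: str) -> str:
    """Return info.json text with factorio_version pointing at the new release.

    Every other field is carried over unchanged.
    """
    info = json.loads(info_json_text)
    info["factorio_version"] = to_info_json_version(game_version)
    return json.dumps(info, indent=2)


# Hypothetical mod metadata, for illustration only.
sample = '{"name": "example-mod", "version": "1.0.0", "factorio_version": "0.16"}'
bumped = bump_factorio_version(sample, "0.17.4")
```

A CI job could write `bumped` back to the mod's `info.json` before trying to load the mod in the new game version.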
Re: Automated testing against new game versions?
I was just thinking about this, too.
The other projects for which I am the primary maintainer have tests set up with Travis-CI, etc. to validate changes. Test coverage isn't always complete, but what's there is very helpful for catching issues in feature branches & pull requests before they get merged in.
It would be great to have some way of running tests on Factorio mods, besides the obvious "load the mod in a game and play with it to see if anything broke" (which is error-prone for mods with any level of complexity).
Even just a command-line parameter in the `factorio` binary that allows specifying a mod file/directory to check for loadability would go a long way.
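If such a loadability check existed, the CI side could be quite small. The sketch below assumes only that the check is some command that exits with a non-zero status when the mod fails to load. It does not assume any particular `factorio` flag, so the command itself is a placeholder supplied by the caller, and `check_loads` is a hypothetical name:

```python
import subprocess
import sys


def check_loads(command, timeout_seconds=300):
    """Run a load-check command and return (ok, combined_output).

    ok is True only when the command exits with status 0 within the timeout.
    """
    try:
        result = subprocess.run(
            command,
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,  # fold stderr into stdout for the report
            text=True,
            timeout=timeout_seconds,
        )
    except subprocess.TimeoutExpired:
        return False, "timed out after %d seconds" % timeout_seconds
    return result.returncode == 0, result.stdout


# Stand-in command so the sketch runs anywhere; a real job would pass the
# game's own load-check command line here instead.
ok, output = check_loads([sys.executable, "-c", "print('loaded')"])
```

A CI service would fail the build whenever `ok` is false and attach `output` to the notification.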
- bobingabout
- Smart Inserter
- Posts: 7352
- Joined: Fri May 09, 2014 1:01 pm
- Contact:
Re: Automated testing against new game versions?
With the amount of things changed in 0.17, it's almost guaranteed that you'll encounter errors.
I mean, if you use science packs in technology at all... error, because they're all renamed.
Re: Automated testing against new game versions?
Related question: will factorio use semantic version numbers from version 1? If so mods would only break on version 2 since the major number would be bumped on breaking changes. That would reduce workload on mod authors and be really clear, compared to a 2-number scheme.
Shameless mod plugging: Ribbon Maze
- bobingabout
Re: Automated testing against new game versions?
H8UL wrote: Mon Feb 25, 2019 3:59 pm Related question: will factorio use semantic version numbers from version 1? If so mods would only break on version 2 since the major number would be bumped on breaking changes. That would reduce workload on mod authors and be really clear, compared to a 2-number scheme.
I don't know what a Semantic number is. But AFAIK, the 3 part number system will remain. we'll probably start at 1.0.0, then 1.0.1 would be a bug fix, and if any major features or changes were added, we'd go to 1.1.0
Re: Automated testing against new game versions?
Semantic versioning = https://semver.org/ (explains it quite thoroughly)
But this subject doesn't really have to do with testing per se…
Re: Automated testing against new game versions?
using semver or not is related to testing, in that if we knew they were using semver then we would know how important/mandatory testing was for particular updates.
Re: Automated testing against new game versions?
If we could run CI/CD on mods, though, it almost wouldn't matter if the game uses semver or not. (I say "almost", because there will always be cases where test coverage misses something.)
Re: Automated testing against new game versions?
dgw wrote: Wed Feb 27, 2019 11:57 pm If we could run CI/CD on mods, though, it almost wouldn't matter if the game uses semver or not. (I say "almost", because there will always be cases where test coverage misses something.)
It's a big if though, don't you think? Could you imagine wube devoting resources to CI all the mods in the portal, whenever a new version is released?
More practical is to minimize the frequency with which mod updates are enforced when there aren't any breaking changes to the API, hence semantic versioning. After all, the root of the OP's problem:
sparr wrote: Tue Jan 23, 2018 7:53 pm I am a mod author who does not play Factorio every month. I play for a few months, then quit for a few months to a year, then come back to see new features in the game. Unfortunately this means my mods are dead/useless after a new game version comes out, until I return and update them.
This doesn't happen on every patch, just on the major releases.
With semantic versioning, once software goes to version 1, then backwards-incompatible changes must bump the first number. So mods would be forced to re-release for version 1,2,3... but not for version 1.1, 1.2, 1.3, ... because there would be no backwards-incompatible changes to the modding API between 1.x and 2.x.
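The rule above can be sketched as a small check. It assumes the game follows semver strictly and that the new release is not older than the one the mod was tested against; the function names are hypothetical. Under semver, a 0.x series makes no compatibility promise, so any change there is treated as potentially breaking:

```python
def parse_version(version: str) -> tuple:
    """Turn '1.3.2' into (1, 3, 2)."""
    return tuple(int(part) for part in version.split("."))


def may_break(tested_against: str, new_release: str) -> bool:
    """Return True if a semver-following game may have broken the mod.

    From 1.0 onward only a change in the first number signals breaking
    changes; in the 0.x series any version change may break things.
    """
    old = parse_version(tested_against)
    new = parse_version(new_release)
    if old[0] == 0 or new[0] == 0:
        return old != new
    return old[0] != new[0]
```

With this rule, a mod tested against 1.1.0 would not need a forced re-release for 1.3.2, but would for 2.0.0.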
When there is a sales reason to keep that first version (e.g. for sequels), I sometimes adapt semantic versioning and add an extra number at the front for the sales guys. So when stable, Factorio could be 1.major.minor.patch, saving 2.x for Factorio 2.
If Wube intend to follow semantic versioning, it gives some indication of how likely the OP's problem is to continue after 1.0. Of course, there is nothing to stop them from bumping the first number a lot, or abandoning semantic versioning later -- but semantic versioning is very often a positive sign that API stability is baked into release management, and that breaking changes are avoided. It even has a psychological effect -- since adopting it, I hate breaking an API, especially if it means bumping the major version number with no big features!
Re: Automated testing against new game versions?
H8UL wrote: Sat Mar 16, 2019 11:14 pm It's a big if though, don't you think? Could you imagine wube devoting resources to CI all the mods in the portal, whenever a new version is released?
Not suggesting that Wube be responsible for running the tests. Only that the headless binary (or a dedicated test binary) be capable of running tests with the appropriate CLI flags. Existing CI services (Travis, CircleCI, etc.) would do just fine.
Re: Automated testing against new game versions?
H8UL wrote: Sat Mar 16, 2019 11:14 pm It's a big if though, don't you think? Could you imagine wube devoting resources to CI all the mods in the portal, whenever a new version is released?
dgw wrote: Sat Mar 16, 2019 11:31 pm Not suggesting that Wube be responsible for running the tests. Only that the headless binary (or a dedicated test binary) be capable of running tests with the appropriate CLI flags. Existing CI services (Travis, CircleCI, etc.) would do just fine.
That would be fantastic -- I despise the way I have no automated tests for my mods. But the OP did suggest various levels of ambitiousness, including Wube running them for us, and if you're talking about eliminating the human when the tests all pass on a major version change, then you need more than the minimum CI capabilities: for example, a way to detect the upstream version, and an API with appropriate authentication to auto-release.
Or if that's not what we're talking about, then the problem that a major version forces a mod release remains. I can get an email about it from my CI, but I still have to do it. But as Factorio becomes more stable, will it really be necessary for small mods to go through this process every 6-12 months?
Either way, that pesky major version number keeps coming into it -- implying the versioning scheme is important. And it must be, because CI is about change control, and the OP's problem is about forced releases.
OP seems to agree, anyway!
sparr wrote: Wed Feb 27, 2019 6:16 pm using semver or not is related to testing, in that if we knew they were using semver then we would know how important/mandatory testing was for particular updates.
- bobingabout
Re: Automated testing against new game versions?
Something you should all keep in mind is that before something is actually released, it's probably going to change.
I have source access, and I was constantly compiling new versions of the game (every week or two) and checking to see what in my mods was broken, fixing it, and moving on.
About once a month, I had to change things just to make the mods load in game.
3 days before release, my mods were working fine. Upon actual release, things were broken (which is why it took me about 6 hours after the game's release to release my mods).
And even since 0.17 was released, there have been a couple of occasions where I've had to update mods because of changes.
So... something like this is... unlikely.
Besides, what can it check? Does it crash when attempting to load the game? Probably. Do scripts crash during gameplay? Probably not: a lot of scripts have issues in very specific situations that automated tests would probably miss.
Re: Automated testing against new game versions?
Bob, your mods are among the most complex mods that exist. You are exceptional, and your experience is relatively unique.
Most mods run fine between game versions with no changes at all other than bumping the metadata.
Most other mods break very rarely, less than once per major version on average, and giving those authors a way to get an alert when their mod breaks instead of having to check manually or wait for annoyed users would be valuable.
Re: Automated testing against new game versions?
What would automated testing look like? For the data phase, if the mod loads, you're probably good.
Perhaps, for where you derive items from other items, you would want to load up the latest version of a number of different mods to see if there's any conflicts.
For the control phase, that's a smidgen more complicated.
Perhaps be able to load a game, then run through a series of canned commands that exercise the control code?
In theory you could have a test harness that uses on_tick to issue commands to the player, then have a command that dumps out global for all loaded mods and quits the game?
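To make the harness idea concrete, here is a language-agnostic model of it written in Python. In an actual mod this would be Lua running inside the game; nothing below is the Factorio API. The class, the `on_tick` method and the state dictionary standing in for `global` are all hypothetical:

```python
import json


class ScriptedHarness:
    """Replays canned commands at fixed ticks, then dumps the resulting state."""

    def __init__(self, script):
        # script maps a tick number to the list of commands to run on that tick;
        # each command is a callable that receives the shared state dict.
        self.script = script
        self.state = {}
        self.errors = []

    def on_tick(self, tick):
        for command in self.script.get(tick, []):
            try:
                command(self.state)
            except Exception as exc:
                # Record the failure and keep going so one run reports everything.
                self.errors.append([tick, repr(exc)])

    def run(self, last_tick):
        """Drive ticks 0..last_tick and return a JSON report of state and errors."""
        for tick in range(last_tick + 1):
            self.on_tick(tick)
        return json.dumps({"state": self.state, "errors": self.errors}, sort_keys=True)


def place_entity(state):
    # Stand-in for a canned player command.
    state["placed"] = state.get("placed", 0) + 1


def broken_command(state):
    # Stand-in for control code that fails in one specific situation.
    raise KeyError("missing")


harness = ScriptedHarness({0: [place_entity], 5: [place_entity, broken_command]})
report = json.loads(harness.run(10))
```

A CI job would compare the dumped state against an expected snapshot and fail if `errors` is non-empty.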