Suggestion of a way how to implement dependent probabilities

Trippelbob · Post by **Trippelbob** » Mon May 14, 2018 8:55 pm

TL;DR

Here's a suggestion for where and how fields would have to be added in order to allow dependent ingredient and product probabilities in recipes.

What?

I suggest expanding the ability of modders to work with volatile values in recipes. Currently, it is possible to have products in a recipe that vary between minimum and maximum values and only appear with a certain probability. This does not work with recipe ingredients, nor is it possible to have multiple products that are dependent on one another. As a Lua user I can only guess how the hard-coded data flow works, but I think with the following additional fields, the problem might be solved. (If not, feel free to correct me.)

What modders do when adding recipes is expand data.raw with a new instance of LuaRecipePrototype. This is where we should store the dependent probabilities in a triple array. The corresponding Lua code could look something like this:

dependent_probability structure

In each section, all probabilities must add up to 100%. Each string value is used to reference the double value behind it and mustn’t be used twice inside the dependent_probability array.

As soon as a recipe is executed, a LuaRecipe instance gets created. When doing so, the given double values are checked against math.random() and converted into boolean values. The crucial point here is that in each section, only one value becomes true, while each section is independent of the others. It will be saved like this (in pseudocode):

dependent_probability in LuaRecipe

Of course, ingredients and products of recipes would have to support this new probability field. Now, the question is why ingredients currently don’t have the probability, amount_min and amount_max fields. Did nobody ask for this or does this create problems for hand-crafting-calculations? Eventually, a volatile ingredient would only mean that the number of items being consumed at the beginning of a crafting process looks different every time.

In the case that this doesn’t represent a problem, I suggest either making the current probability field additionally support a string type or adding a separate field for dependent_probability that only accepts string values. This string value is used to load the corresponding double value from the recipe prototype for the tooltip and to load the boolean value from the recipe instance to calculate the resulting count of ingredients and products for this recipe execution.

Why ?

With these additions, modders of Factorio would have lots of new potential to design interesting manufacturing processes. The first big issue I would like to address is the wear of tools during crafting processes. Let’s say a saw that is used to cut trees shall break with a 5% chance and create debris items that can be recycled. Right now, you could add the same saw with a 95% probability to the output and re-insert it every time it is not consumed, or just scale up the recipe by a factor of 20. But both workarounds are far from being a pleasant solution. And after all, after such a recipe execution, you can end up with both a saw and debris items or neither of them. Not really an ideal solution.

Here’s an example how this problem could be avoided with my suggestion implemented:

Example tool wear

This would mean that the sawing tool breaks with a 5% chance. Always when this happens and only in that case, debris items will be part of the output. But anyways, the big-log-item will be processed into planks. If this shall not happen after a tool break, you could add the negated probability to these items:

example tool wear 2

The other point is recipes that split one item up into several possible products. In vanilla Factorio, this can be seen in uranium-processing, which converts uranium ore into the desired U-235, which spawns with a 0.007 chance, and U-238 with a 0.993 chance. But just like mentioned above, you can end up with both items or neither of them, so on the micro level you’re basically creating items from nowhere or sending them to nirvana.
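To put numbers on this: assuming the two products really are rolled independently of each other (as described above), the chance of each outcome per craft follows from simple multiplication. A minimal sketch in Python; the function name is made up for illustration and is not part of any Factorio API:

```python
def outcome_probabilities(p_a, p_b):
    """Per-craft outcome chances for two independently rolled products."""
    return {
        "both": p_a * p_b,
        "only_a": p_a * (1 - p_b),
        "only_b": (1 - p_a) * p_b,
        "none": (1 - p_a) * (1 - p_b),
    }

# uranium-processing: U-235 at 0.007, U-238 at 0.993
probs = outcome_probabilities(0.007, 0.993)
for outcome, p in probs.items():
    print(f"{outcome}: {p:.6f}")
```

Under that assumption, roughly 0.7% of crafts yield both items and roughly 0.7% yield nothing at all.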

With my suggestion implemented, this problem wouldn’t arise anymore (and you could even split an item into three or more products this way). So, the uranium recipe might look like this:

example uranium-processing

Parts of this idea have already been posted here:
viewtopic.php?f=6&t=47739&p=281168&hili ... ty#p275350
viewtopic.php?f=6&t=60145&p=361046&hili ... ty#p361046

The content of both wasn’t commented on by the devs. Maybe because they weren’t posted in the API suggestions thread? Maybe because I didn’t use the fancy yellow headline template? (Yes, the second thread was created by me – but it contained just a small part of this idea, and this is just too burning an issue for me.)

For everyone who read through all of this: excuse the length of this thread, but I didn’t want to leave any ambiguity. I’m looking forward to reading your thoughts about this issue.

bobingabout · Post by **bobingabout** » Tue May 15, 2018 8:24 am

I read everything but the why section, and I honestly don't have a clue what you're trying to suggest.

So I'm going to make an alternate suggestion.

The ability to specify a weighted result probability would be useful, for example, in the case of Uranium processing, to make sure you get one result OR the other, instead of the values currently being just literal, and independent, allowing you to end up with both or neither each cycle.

eradicator · Post by **eradicator** » Tue May 15, 2018 2:34 pm

As far as i remember probability based input has been rejected for technical/performance reasons.
As for weighted distribution output, does that really provide any meaningful improvement?

I'm not a master of statistics, so to prove this to myself i wrote a short python program that tests if there is any difference between the current unlinked probability and the suggested weighted distribution. The test case used is a 5%:95% situation with two possible outputs (i.e. uranium).

PYTHON CODE

Result:

Code: Select all

For 100 trials:
 Normal  : A = 7 ,B = 90
 Weighted: A = 7 ,B = 93
For 1000 trials:
 Normal  : A = 47 ,B = 956
 Weighted: A = 47 ,B = 953
For 100000 trials:
 Normal  : A = 4930 ,B = 94939
 Weighted: A = 4930 ,B = 95070

As you can see even for only 100 trials there is no relevant change in output distribution. That is to say that statistically the proposed change has no effect at all on how the game plays.

mrvn · Post by **mrvn** » Tue May 15, 2018 2:52 pm

eradicator wrote:As far as i remember probability based input has been rejected for technical/performance reasons.
As for weighted distribution output, does that really provide any meaningful improvement?

I'm not a master of statistics, so to prove this to myself i wrote a short python program that tests if there is any difference between the current unlinked probability and the suggested weighted distribution. The test case used is a 5%:95% situation with two possible outputs (i.e. uranium).
PYTHON CODE
Code: Select all
from random import randint as rint
from collections import Counter

def a(seed): # both can happen
    if seed <= 5:
        return [1,2]
    elif seed <= 95:
        return [2]
    else:
        return []

def b(seed): # only one can happen
    if seed <= 5:
        return [1]
    else:
        return [2]

# run one million trials with equal seeds for both functions
def test (trials):
    list1 = []
    list2 = []

    for i in range(1,trials):
        seed = rint(1,100)
        list1.extend(a(seed))
        list2.extend(b(seed))

    x = Counter(list1)
    y = Counter(list2)

    print('For',trials,'trials:')
    print(' Normal  : A =',x[1],',B =',x[2])
    print(' Weighted: A =',y[1],',B =',y[2])

for i in [100,1000,100000]:
    test(i)
Result:
Code: Select all
For 100 trials:
 Normal  : A = 4 ,B = 92
 Weighted: A = 4 ,B = 95
For 1000 trials:
 Normal  : A = 53 ,B = 957
 Weighted: A = 53 ,B = 946
For 100000 trials:
 Normal  : A = 5032 ,B = 94923
 Weighted: A = 5032 ,B = 94967
As you can see even for only 100 trials there is no relevant change in output distribution. That is to say that statistically the proposed change has no effect at all on how the game plays.

First, you have an off-by-one error in there. You only do 99/999/99999 trials.

Second, your a() function isn't how the game works, I think. It should roll two dice. The first is checked against 5% for the first output, the second is checked against 95% for the second output.

Third, I think you missed the point. In the long run the average will be 5% + 95% = 1 item per turn. But in the short run you might end up with 10 turns giving you both items. Then 200 turns giving you nothing. Unlikely but possible. This is annoying when you start out and feed something by hand, and it can block production because at some times too much output collects for the inserters to keep up, while at other times they sit idle doing nothing.

It also makes much more sense. You put in some raw materials, you get out some product. You always get the same amount of product. Only the type of product differs. It makes sense where you expect the input and output to have basically equal mass. Mass doesn't suddenly double or disappear.
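The second and third points can be sketched as a small simulation: one function rolls each product separately, the other makes a single roll that picks exactly one product, and a helper counts how many crafts hand out 0, 1 or 2 items. This is only an illustration of the argument, not the game's actual code; all function names are made up, and the 5%/95% split is the test case from the quoted post.

```python
import random

def independent_craft(rng, p_a=0.05, p_b=0.95):
    """Assumed current behaviour: one separate roll per product."""
    return (rng.random() < p_a, rng.random() < p_b)

def exclusive_craft(rng, p_a=0.05):
    """Proposed behaviour: a single roll picks exactly one product."""
    got_a = rng.random() < p_a
    return (got_a, not got_a)

def items_per_craft(craft, trials, seed=0):
    """Count how many crafts produced 0, 1 or 2 items."""
    rng = random.Random(seed)
    counts = {0: 0, 1: 0, 2: 0}
    for _ in range(trials):  # range(trials) runs exactly `trials` times
        a, b = craft(rng)
        counts[int(a) + int(b)] += 1
    return counts

for trials in (100, 1000, 100000):
    print(trials, "independent:", items_per_craft(independent_craft, trials))
    print(trials, "exclusive:  ", items_per_craft(exclusive_craft, trials))
```

The totals per item type come out about the same either way; the difference is that only the independent version ever produces crafts with zero or two items.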

eradicator · Post by **eradicator** » Tue May 15, 2018 3:06 pm

mrvn wrote:First you have an off-by-one error in there. You only do 99/999/99999 trials.

That was fixed one minute after posting. You were probably already typing though :P.

mrvn wrote:Second your a() function isn't how the game works I think. Should roll two dice. The first is checked against 5% for the first output, the second is checked against 95% for the second output.

True. But of doubtful relevance to the final result. Feel free to fix and post results.

mrvn wrote:Then 200 turns giving you nothing. Unlikely but possible.

The sun could also spontaneously explode. Unlikely but possible.

mrvn wrote:Third I think you missed the point. In the long run the average will be 5% + 95% = 1 item per turn. But in the short run you might end up with 10 turns giving you both items.

I think it is you who missed my point. I'm saying that the "short run" is irrelevant for normal gameplay because nobody sits there all day babysitting their assemblers. As such i don't think it's worth implementing this as it requires some rather large changes to how the game processes recipes - if it is at all possible without affecting performance for all recipes. Realism/mass conservation are irrelevant here, it's just a game mechanic.

TL;DR:
Would i use it if it was implemented? Sure. But only because it looks nicer, not because it changes anything relevant.

Arch666Angel · Post by **Arch666Angel** » Tue May 15, 2018 3:20 pm

Really, anything that lets us change recipes around in an interesting way is most welcome, because they are the core of what the game is about. That also means that fumbling around with this system has to be done carefully so as not to break all the things. But I'm all in for a one-OR-the-other probability system, or a result table to roll on for a single result.

Trippelbob · Post by **Trippelbob** » Tue May 15, 2018 5:40 pm

bobingabout wrote:I read everything but the why section, and I honestly don't have a clue what you're trying to suggest.

So I'm going to make an alternate suggestion.

The ability to specify a weighted result probability would be useful, for example, in the case of Uranium processing, to make sure you get one result OR the other, instead of the values currently being just literal, and independent, allowing you to end up with both or neither each cycle.

Excuse me for causing a misunderstanding, but I have to admit that your response surprises me a little, as I thought that you, as one of the main modders here, practically know the Factorio API by heart.

Your own suggestion is basically what would result from implementing the technical details I described (as can be read in the Why section). So let’s sum it up a little:

1) Current state: probability is only accessible in the “product” concept.

2) Also current state: recipes have an array of products that each have their unique probability, so there is no chance of linking several product probabilities or even those of ingredients.

3) Now comes my suggestion: That’s why we should implement dependent probabilities in LuaRecipePrototype. For doing so, I suggest a triple array in that place which has these properties:
- we can define multiple dependent probability “groups” or “sections” per recipe.
- each group consists of 2 or more single probability “items” that are dependent on one another and must add up to 100%.
- each single probability item is referenced by a unique string for being identified by the product and ingredient concept.

4) When a recipe is executed in-game, a LuaRecipe instance is generated from the prototype. At this point, we must decide which single probability item of each group becomes true while the rest becomes false. That’s how we can influence which ingredient will be consumed or which product will be produced.

5) How this might look in Lua can be seen in my code examples in the original post.
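Steps 3 and 4 could be sketched roughly like this, written in Python rather than Lua just to show the logic. Everything here is an assumption for illustration: the function names, the representation of a group as a mapping from string key to probability, and the "saw-breaks"/"saw-survives" keys are made up and are not part of the Factorio API.

```python
import random

def validate_groups(groups):
    """Check that keys are unique across groups and each group sums to 1."""
    seen = set()
    for group in groups:
        if abs(sum(group.values()) - 1.0) > 1e-9:
            raise ValueError("probabilities in a group must add up to 100%")
        for key in group:
            if key in seen:
                raise ValueError(f"key {key!r} used twice")
            seen.add(key)

def roll_groups(groups, rng=random):
    """Return {key: bool}; exactly one key per group becomes True."""
    outcome = {}
    for group in groups:
        keys = list(group)
        winner = rng.choices(keys, weights=[group[k] for k in keys])[0]
        for key in keys:
            outcome[key] = (key == winner)
    return outcome

# Hypothetical tool-wear group: the saw breaks with a 5% chance.
groups = [{"saw-breaks": 0.05, "saw-survives": 0.95}]
validate_groups(groups)
print(roll_groups(groups))
```

An ingredient or product would then look up its string key in the rolled result to decide whether it takes part in this particular craft; each group is rolled on its own, so groups stay independent of each other.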

sthalik · Post by **sthalik** » Wed May 16, 2018 2:09 am

Does the API respect metatables/prototypes? If so, you may be able to do it like that. I unconditionally return "42" but you can do what you want in the "__index" method inside the metatable/prototype.

Code: Select all

> Foo = {}
> Foo.mt = {}
> function Foo.new(self) setmetatable(self, Foo.mt); return self; end
> function Foo.mt.__index(table, key) return 42; end
> datum = Foo:new()
> datum.bleh
42
> datum
table: 00000000001c99a0

A bit of an explanation here: if you do "setmetatable" on what you're passing to data:extend(), you should be able to return a precomputed value, rather than a constant value member.

This depends on the API respecting metatables, and calling "products"/"probability" when it's needed, rather than only once and caching the result. Add a print statement in the __index function and see what happens.

bobingabout · Post by **bobingabout** » Wed May 16, 2018 8:11 am

Trippelbob wrote:Excuse me for causing a misunderstanding, but I have to admit that your response surprises me a little, as I thought that you, as one of the main modders here, practically know the Factorio API by heart

No, I actually don't do much scripting at all, and what you listed there is the scripting API. What you call a product I know as a result, because that's what it's called in the data phase. Ah, the inconsistencies!
When it comes to scripting, I reference that material heavily, but usually only the area that I'm currently working on, basically learning it as I'm doing it.
I do however know most of the data phase by heart. Maybe I should have read all the way down to the Why section, rather than just reading the top part, but usually the top part is all you need to read to know what a suggestion is; that's why the template is laid out like it is.

Trippelbob wrote:That’s why we should implement dependent probabilities

See, that term confuses me, what the heck is a dependant probability?

Trippelbob wrote:4) When a recipe is executed in-game, a LuaRecipe instance is generated from the prototype. At this point, we must decide which single probability item of each group becomes true while the rest becomes false. That’s how we can influence which ingredient will be consumed or which product will be produced.

Why do we need to generate a recipe each time it is executed? if programmed correctly, you wouldn't need to do that, just run it.

Anyway, I wouldn't define things the way you suggested them. also wouldn't call it "dependant probability", so let me have a go.

Code: Select all

{
  type = "recipe",
  name = "uranium-processing",
  energy_required = 10,
  enabled = false,
  category = "centrifuging",
  ingredients = {{"uranium-ore", 10}},
  icon = "__base__/graphics/icons/uranium-processing.png",
  icon_size = 32,
  subgroup = "raw-material",
  order = "k[uranium-processing]",
  results =
  {
    {
      type = "choice",
      results =
      {
        {
          name = "uranium-235",
          weight = 7,
          amount = 1
        },
        {
          name = "uranium-238",
          weight = 993,
          amount = 1
        }
      }
    }
  }
},

Short description: add a new item type of "choice", or name it whatever you like. This means that it's a single result in the end, and it will be rolled each time.
Since we know there will always be one result in the choice, either one item or another, the item has a weight rather than a probability. What's the difference? Probabilities need to add up to 100% (or 1), whereas a weight is just this outcome's share of the total weight of the entire decision. The chance is simply the weight of this one item divided by the weights of all items added together.
So you could have a weight of 1 for one item and 999999 for another, for a 1 in a million chance.
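The weight arithmetic can be sketched in a few lines of Python. This is only an illustration of the idea; the "choice" type does not exist in the game, and the function names are made up.

```python
import random

def chances(entries):
    """Turn weights into probabilities: weight / total weight."""
    total = sum(e["weight"] for e in entries)
    return {e["name"]: e["weight"] / total for e in entries}

def roll_choice(entries, rng=random):
    """Pick exactly one entry, with likelihood proportional to its weight."""
    weights = [e["weight"] for e in entries]
    return rng.choices(entries, weights=weights)[0]

entries = [
    {"name": "uranium-235", "weight": 7, "amount": 1},
    {"name": "uranium-238", "weight": 993, "amount": 1},
]
print(chances(entries))
print(roll_choice(entries)["name"])
```

With weights 7 and 993 the total is 1000, so the chances work out to 0.007 and 0.993, and every roll returns exactly one of the two entries.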

Trippelbob · Post by **Trippelbob** » Wed May 16, 2018 4:45 pm

bobingabout wrote:See, that term confuses me, what the heck is a dependant probability?

So, what’s a dependent probability? I took this term from the first thread I cited in the original post. Let’s give an example: the probability of getting a 1 when rolling a die is p=1/6. Getting one of the numbers 2-6 with that same throw is p=5/6. It isn’t possible for both events to occur at the same time, so they are dependent on one another. Rolling two different dice, on the other hand, means that the first might give you a 1 and the other die might give you one of the other numbers. The numerical probability values for these events haven’t changed, but the two dice are independent of one another.
And this is how Factorio currently works, which is what both of us don’t like, right? Judging by your suggestions, we’re talking about the same thing, whether it is called “dependent-probability” or “weighted-single-result”. Although I would also like to see this become possible for ingredients.

bobingabout wrote:Why do we need to generate a recipe each time it is executed? if programmed correctly, you wouldn't need to do that, just run it.

Honestly, I’m not quite sure whether that LuaRecipe instance only gets created when choosing a recipe or whether it gets re-created every time the crafting progress bar begins another run. I guessed the second. Even if that’s wrong, we could still refresh the boolean array that I talked about every time it starts again.

bobingabout wrote:Short description. Add a new item type of "choice", or name it whatever you like, this means that it's a single result in the end, and will be rolled each time [...]

Ultimately, I really don’t care about the actual way of implementation but rather about the resulting new ways of creating mod recipes. In order to move the discussion along a little and not just suggest the same basic abstract idea another time, I worked out a possible (?) way it might look in Lua and the API. You’re right to suggest weights rather than odd floating-point numbers, as they seem to be easier to use and don’t have to add up to 100%. But still, we don’t get around the question of how this might look in the API, and that was sort of my main topic here.

bobingabout · Post by **bobingabout** » Mon May 21, 2018 9:12 am

One of the things I like about my method, is that you could do this.

Code: Select all

  results =
  {
    {
      type = "choice",
      results =
      {
        {
          type = "choice",
          results =
          {
            {
              name = "uranium-235",
              weight = 7,
              amount = 1
            },
            {
              name = "coal",
              weight = 3,
              amount = 1
            }
          }
        },
        {
          name = "uranium-238",
          weight = 990,
          amount = 1
        }
      }
    }
  }

I don't know why you would, but you could.

But yes, ultimately, We both want the same thing, and it doesn't really matter to me how it's implemented either, as long as it makes sense.
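How a nested choice like the one above might be resolved, as a rough Python sketch. One loud assumption: the example gives no weight for the inner choice, so this sketch treats a nested choice without an explicit weight as weighing the sum of its own entries (7 + 3 = 10 here). That rule, like the function names, is made up for illustration and is not something the game defines.

```python
import random

def weight_of(entry):
    """Explicit weight if given; assumed fallback: sum of nested weights."""
    if "weight" in entry:
        return entry["weight"]
    return sum(weight_of(e) for e in entry["results"])

def resolve(entry, rng=random):
    """Roll nested choices until a plain item entry is reached."""
    while entry.get("type") == "choice":
        options = entry["results"]
        weights = [weight_of(e) for e in options]
        entry = rng.choices(options, weights=weights)[0]
    return entry

nested = {
    "type": "choice",
    "results": [
        {
            "type": "choice",
            "results": [
                {"name": "uranium-235", "weight": 7, "amount": 1},
                {"name": "coal", "weight": 3, "amount": 1},
            ],
        },
        {"name": "uranium-238", "weight": 990, "amount": 1},
    ],
}
print(resolve(nested)["name"])
```

Under that assumption the outer roll is 10 against 990, and the inner roll then splits its share 7 to 3 between uranium-235 and coal.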

sthalik · Post by **sthalik** » Tue May 22, 2018 8:07 am

Can you add arbitrary metadata to data.raw? Will the game consider this an error and refuse to load the mod?

If you can in fact add arbitrary things in there, a mod can traverse data.raw and locate all that custom metadata. For instance add a dummy item signifying the actual choice.

Sorry if these questions are elementary, I'm really new to modding Factorio. The docs are sort of sparse :(

bobingabout · Post by **bobingabout** » Tue May 22, 2018 8:35 am

You can add whatever additional data you want, but the game will just ignore it and not even store it in the database, so once the full data phase has finished loading (title screen appears) it's all discarded.

So... no, you can't just store data about recipes in data.raw that you can use later.

Factorio Forums