Thank you, I had never heard of that; I had to watch some more videos to not feel lost:
How Blurs & Filters Work - Computerphile (8 min): kernel convolution, what math is being done, visually explained.
https://www.youtube.com/watch?v=C_zFhWdM4ic
Finding the Edges (Sobel Operator) - Computerphile (8 min): combining several kernels and using them for an image-editing purpose.
https://www.youtube.com/watch?v=uihBwtPIBxM
CNN: Convolutional Neural Networks Explained (15 min): how they are combined in a neural network like the very generic one described in the earlier video.
https://www.youtube.com/watch?v=py5byOOHZM8
That's not related to gaming AI, but the pictures and drawings are very similar to a board game; they helped me visualize what you called "perform a 3x3 convolution" and what it could be used for, with some examples of popularization too. The videos reference each other, as they are related.
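For instance, here is my own little numpy sketch of what one 3x3 convolution does, as I understood it from the Sobel video (my own toy example, not this map's actual implementation):

```python
import numpy as np
from scipy.signal import convolve2d

# The horizontal Sobel kernel from the Computerphile video: it
# responds to left/right brightness changes, i.e. vertical edges.
sobel_x = np.array([[-1, 0, 1],
                    [-2, 0, 2],
                    [-1, 0, 1]])

image = np.random.rand(15, 15)  # stand-in for a grayscale image

# Each output value is the weighted sum of the 3x3 neighborhood
# around the corresponding input pixel; zeros are assumed outside.
edges = convolve2d(image, sobel_x, mode="same", boundary="fill", fillvalue=0)
```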
deep_remi wrote: ↑Tue Apr 26, 2022 8:46 am
- The "parameter 1" blocks multiply each input channel by 9 constant values
- The 9 outputs are summed over input channels, so it produces 9 different values that are fed into the large array of 3x3 additions. These are full-board values (one different signal for each square of the board)
- Each of the 3x3 blocks of the large array computes one signal as the sum of the 3x3 different signals from the local neighborhood. You may notice that the 3x3 blocks on the side are a bit different, because there is no value coming from the outside of the board.
- The 15x15 signals obtained by these 3x3 addition blocks are combined into the output for this channel
Vertical wires in Factorio overlap, which makes it a bit difficult to visualize the connection pattern.
It's easier to see the wiring when the 3x3 blocks alternate rotation. Now I realise the large array is composed of 3x3 blocks organised in a 15x15 grid like the board (45x45 combinators in total).
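If I translate your description into numpy (just my understanding of it, the real thing being combinators, and the values below being random placeholders):

```python
import numpy as np

C, N = 8, 15                          # input channels, board size
x = np.random.randn(C, N, N)          # one full-board signal per input channel
w = np.random.randn(C, 3, 3)          # the 9 "parameter 1" constants per channel

# Step 1 - the "parameter 1" blocks: multiply each channel by its 9
# constants and sum over channels -> 9 different full-board signals.
s = np.einsum('cij,ckl->klij', x, w)  # s[dk, dl] is one 15x15 signal

# Step 2 - the large array of 3x3 additions: each square sums the 9
# signals taken from its 3x3 neighborhood. Squares on the side get
# fewer terms, since no value comes from outside the board.
out = np.zeros((N, N))
for dk in range(3):
    for dl in range(3):
        for i in range(N):
            for j in range(N):
                ni, nj = i + dk - 1, j + dl - 1
                if 0 <= ni < N and 0 <= nj < N:
                    out[i, j] += s[dk, dl, ni, nj]
```

Each (i, j) iteration corresponds to one 3x3 adder block of the large array.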
deep_remi wrote: ↑Tue Apr 26, 2022 8:46 am
What you labelled "parameter 1" are the 3x3 convolution weights. "parameter 2" is the bias, and ReLU (Rectified Linear Unit) activation. The large arrays with 3x3 blocks are indeed all identical to each other. They perform the additions of the convolutions (add all 3x3 values around each point of the board).
That makes sense now. The word "filter" where I used it was not correct, because it has another particular meaning in this context: it seemed to be associated with the convolution weights in the video you linked. A "filter" is a 3x3 matrix containing some numbers that the AI adjusted while training, to detect something that helped it win games.
Whereas the large array is a way to parallelize all the operations that constitute the convolution. It is the Factorio implementation of the operation, using only arithmetic combinators.
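So one full channel of one layer, in the same numpy style (the bias value here is made up), would be something like:

```python
import numpy as np

conv_sum = np.random.randn(15, 15)  # result of the 3x3 addition array
bias = 0.5                          # the "parameter 2" constant (made up)

# "parameter 2" = bias + ReLU: add a constant, keep only the positive part
channel_out = np.maximum(conv_sum + bias, 0)
```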
Injected values from the constant combinators are used to make sure that only one move is selected, even if the neural network gives two identical values to two different moves. There is a bitwise "and" combinator that clears the 8 least significant bits, and the constants have a different value for each signal (1, 2, 3, ...) to make sure that the resulting signals are all different from each other. Also, a big negative value is added to occupied squares, such that an illegal move is not selected.
I spotted this located at the intersection of the green and red boxes in the annotated pictures.
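If I understood the trick, in Python it would look something like this (the -10**9 penalty is my own guess for the "big negative value"):

```python
import numpy as np

values = np.random.randint(0, 2**20, size=225)  # network output, one per square
occupied = np.zeros(225, dtype=bool)            # True where a stone already sits
occupied[112] = True                            # e.g. the center square is taken

# Clear the 8 least significant bits, then add a unique constant
# (1, 2, 3, ...) per square so no two signals can be equal.
scores = (values & ~0xFF) + np.arange(1, 226)

# A big negative value on occupied squares (my guess: -10**9) makes
# sure an illegal move can never be the maximum.
scores[occupied] -= 10**9

best = int(np.argmax(scores))
print(divmod(best, 15))  # (row, col) of the selected move
```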
For games such as gomoku, Othello, or Go, a move is a point of the board, which is simpler to handle. The convolutional neural network produces a value for each point of the board, and may give a high value to points that are illegal moves. That's why it is important to filter out illegal moves when picking the move chosen by the network.
I don't do this, but it would be possible to train the neural network to give low values to illegal moves. But it would still not be very safe to rely on the network's decision about move legality, because it might produce inaccurate outputs. So the system should be able to determine which moves are legal on its own.
For games where pieces are moved, such as chess, things are more complicated. We usually have a different output channel for each piece of the board, and each direction of movement. DeepMind's AlphaZero uses 73 convolutional output neurons to encode a chess move (see table S2 in that paper:
https://arxiv.org/pdf/1712.01815.pdf).
This is surprising; I expected the machine to only be allowed to make legal moves during training, by design, and therefore be unable to even propose an illegal move when the resulting network is exported and playing. But an illegal "output" is more complex to define in chess, as it depends first on which piece is selected (pieces do not have the same movement rules), and then on what is on the board around that piece (which squares are occupied, and by whom), whereas in Gomoku only the second part matters. Maybe that explains the different approach?
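If I read table S2 correctly, the 73 planes break down into 56 "queen-like" moves (8 directions x up to 7 squares), 8 knight jumps, and 9 underpromotions; each plane covers the whole 8x8 board, so a move is a (from-square, plane) pair, something like:

```python
# My reading of table S2 of the AlphaZero paper (simplified):
QUEEN_PLANES = 8 * 7        # 8 directions x up to 7 squares = 56
KNIGHT_PLANES = 8           # the 8 knight jumps
UNDERPROMO_PLANES = 3 * 3   # {knight, bishop, rook} x {capture-left, push, capture-right}
TOTAL_PLANES = QUEEN_PLANES + KNIGHT_PLANES + UNDERPROMO_PLANES  # = 73

def move_index(from_square: int, plane: int) -> int:
    """Map a (from-square, plane) pair to one of 8*8*73 = 4672 outputs."""
    return from_square * TOTAL_PLANES + plane
```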
The AI needs at least one piece of each color on the board in order to play.
Here is a gomoku opening book I built with a strong neural network. It can help you to choose a balanced opening:
https://www.crazy-sensei.com/book/gomoku_15x15/
I also noticed that the AI will sometimes miss an immediate win. 8 layers of 8 neurons is really a very small network, and it is not big enough to produce a very smart AI.
I have found one way to win: when I put their pawn in the corner and I start with 2 in the middle, hahaha. I don't need a balanced opening, I need a less obvious way to cheat.
I tried playing the exact same game again to see if the AI would do something different. From all I've looked at, I haven't seen any pseudo-randomness added, but I couldn't remember my own moves exactly and the game went differently. I expect the exact same game would happen if I wrote down my moves, though?
deep_remi wrote: ↑Tue Apr 26, 2022 8:46 am
If you look carefully, you'll notice that there are also some "shortcut" connections that skip one layer. It is a so-called "residual" neural network:
https://en.wikipedia.org/wiki/Residual_neural_network
Only one channel of the 4 outputs is used. The other 3 are evaluation neurons that estimate the probability of win, draw, and loss. But they are not used here.
I tried to optimize performance by powering off the network when it is not used, but it does not help at all. It seems that combinators use almost exactly the same CPU time whether they are powered on or off.
(I have re-ordered and altered some quotes.)
I noticed the "shortcut"; the wiki page contains three "clarification needed" tags in the only three paragraphs that are not obscure formulas. Though I don't think I really need to understand WHY they are here for my idea of optimization; they do make things a bit more complicated, but the math is not fundamentally different.
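As far as I can tell, the math of a shortcut is just one extra addition. A toy sketch (the layer function is a scalar stand-in, not the real convolution):

```python
import numpy as np

def layer(x, w, b):
    # stand-in for one layer's conv + bias + ReLU
    return np.maximum(x * w + b, 0)

x = np.random.randn(15, 15)

plain    = layer(layer(x, 1.2, 0.1), 0.8, -0.2)       # two stacked layers
shortcut = layer(layer(x, 1.2, 0.1), 0.8, -0.2) + x   # same, plus the skip
```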
(I removed the 3 unused neurons for my testing.)
When I loaded the map on a 2009 laptop, it was running at around 18 UPS, so I looked at what was taking the most time using the F4 menu and "show-time-usage". I noticed it's not the circuit network that consumes the update time: entities account for around 38, the electric network for 10, and the circuit network only for roughly 2. Therefore my conclusion is that there are too many combinators, not that the values are constantly changing or that there are too many calculations being done.
I think for this particular case the massive parallelization occurring is actually not benefiting the UPS metric. I think if there was only one single neuron (one 45x45 array) instead of 68, the Gomoku itself would be slower to unfold in real time, but you would still get 60 UPS, potentially even more. It's just an idea, but say you allow yourself to work with a delay of 2 or 3 seconds, i.e. 120 to 180 ticks. Then you can create a timer after each human move, during which you have 3 ticks to use the single array as neuron 1 and store the result, then 3 ticks to use it as neuron 2 and store the result, then 3 ticks as neuron 3, and so on.
This would require the 68 "parameter 1" and the 68 "parameter 2" sets to be stored in constant combinators, maybe 9 or 27 unique channels (on top of the 15x15 already used for the board and the few used for the player turn and reset), and a better map of where the "shortcuts" are located than what I have at the moment. From what I've seen, it would sometimes mean feeding the result of, say, neuron 4 located in an upper layer to neuron 16 located in a lower layer (made-up illustrative numbers), so the serialization must be managed carefully, swapping the parameters at the input of the single array for the minimum amount of time needed to perform the convolution.
A 3-second, 180-tick delay gives between 2 and 3 ticks per swap for the 68 neurons; maybe it's possible with only 2 ticks, or maybe 4 or 5 ticks are required to store and fetch the parameters, which would mean the AI takes 5 or 6 seconds to make a move. But during those 5 or 6 seconds you could maybe have 120 or 180 UPS. That's the kind of trade-off I'm seeing possible here (a rough sketch of the idea follows below). Maybe using 8 neurons instead of 1 or 68 is a good compromise, to reduce the total number of active/powered entities, as that seemed to be the cause of the UPS loss for me.
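In Python terms, the serialization I have in mind looks like this (made-up parameters; the shortcut bookkeeping is ignored, and compute() is a crude stand-in for the single shared 45x45 array):

```python
import numpy as np

N_NEURONS = 68
TICKS_PER_SWAP = 3  # my guess; maybe 2 is enough, maybe 4 or 5 are needed

# The 68 "parameter 1" / "parameter 2" sets, as if read from constant combinators
weights = [np.random.randn(8, 3, 3) for _ in range(N_NEURONS)]
biases = np.random.randn(N_NEURONS)

def compute(n, inputs):
    # crude stand-in for one pass through the single shared array:
    # weight each channel by its central kernel value, sum, bias, ReLU
    w = weights[n][:, 1, 1]
    return np.maximum((inputs * w[:, None, None]).sum(axis=0) + biases[n], 0)

inputs = np.random.randn(8, 15, 15)
stored = {}
ticks = 0
for n in range(N_NEURONS):
    # swap in neuron n's parameters, run the array once, store the result
    stored[n] = compute(n, inputs)
    ticks += TICKS_PER_SWAP

print(ticks)  # 68 * 3 = 204 ticks, about 3.4 seconds at 60 UPS
```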