Parallel processing in games & applications

hoho · Post by **hoho** » Thu Feb 09, 2017 12:55 pm

Zulan wrote:
hoho wrote:There is also the thing that multithreading can ever only give *at best* linear scaling with core count.
By the way, there is super-linear speedup, although that is something you wouldn't count on or expect.

Pretty much the only time this can happen is if you are memory-bound in a single thread and can get around that with multithreading. Basically, either NUMA or having the code be able to efficiently use caches without hammering RAM as much as with a single-threaded solution.

The problem is that almost no one plays Factorio on a NUMA machine (2+ CPUs, each with its own independent memory pool), and I'm almost certain Factorio's memory access patterns are the kind that won't really benefit all that much from spreading the data over the caches of multiple cores.

Zulan · Post by **Zulan** » Thu Feb 09, 2017 7:03 pm

hoho wrote: Pretty much the only time this can happen is if you are memory-bound in a single thread and can get around that with multithreading. Basically, either NUMA

No, that doesn't get you super-linear speedup. Two memory controllers are only twice as fast as one.

hoho wrote: or having the code be able to efficiently use caches without hammering RAM as much as with single threaded solution.

hoho wrote: The problem is that almost no one plays Factorio on a NUMA machine (2+ CPUs, each with its own independent memory pool), and I'm almost certain Factorio's memory access patterns are the kind that won't really benefit all that much from spreading the data over the caches of multiple cores.

Yes, as I said, it's a rare thing to happen. My point was never to expect super-linear speedup, but to indicate that linear is not necessarily the upper limit.

There seem to be some misconceptions about NUMA. The main one seems to be that, in a system with a single memory controller, memory-bound programs do not benefit from using multiple cores. That is not true. Take a look at these benchmark results and compare the last two columns for parallel vs. single-core sequential RAM bandwidth. Unfortunately the table is not quite up to date, but the gap for desktop systems is indeed only about 25%. That speedup would hardly justify the effort of parallelizing a code like Factorio.

However, most of the things I read here lead me to believe Factorio is mainly memory-latency bound. In that case, the random walk bandwidth benchmark becomes more relevant, and there you can get linear speedups on desktop systems. This is consistent with the results by Harkonnen. But even then, improving memory access locality would matter more.
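To make the latency-bound case concrete, here is a minimal sketch of a random-walk (pointer-chasing) access pattern. It is not the benchmark referred to above; the names `make_random_cycle` and `chase` are made up for this illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <numeric>
#include <random>
#include <utility>
#include <vector>

// Build one random cycle over n slots (Sattolo's algorithm): following
// next[i] repeatedly visits every slot exactly once before returning
// to the starting slot.
std::vector<std::size_t> make_random_cycle(std::size_t n, unsigned seed) {
    std::vector<std::size_t> next(n);
    std::iota(next.begin(), next.end(), std::size_t{0});
    std::mt19937 rng(seed);
    for (std::size_t i = n; i > 1; --i) {
        std::uniform_int_distribution<std::size_t> pick(0, i - 2);
        std::swap(next[i - 1], next[pick(rng)]);
    }
    return next;
}

// Every load depends on the result of the previous one, so a single
// thread cannot overlap the cache misses. With a table much larger than
// the caches, the run time is dominated by memory latency, not bandwidth.
std::size_t chase(const std::vector<std::size_t>& next,
                  std::size_t start, std::size_t steps) {
    assert(start < next.size());
    std::size_t pos = start;
    for (std::size_t s = 0; s < steps; ++s) pos = next[pos];
    return pos;
}
```

Running several independent chases on separate threads keeps several misses in flight at once, which is one plausible reason a latency-bound workload can scale close to linearly even with a single memory controller.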

mrvn · Post by **mrvn** » Mon Feb 13, 2017 10:43 am

Rseding91 wrote:
Zulan wrote:
Rseding91 wrote:
Zulan wrote:One general question that would help my understanding of what concepts can be applied: In what order are the main update methods of entities currently executed?
In the order they're marked as active on a given chunk which changes as they go offline due to not having work to do.
So it's not even grouped by type of entity (e.g. first Factories, then drills, then belts, then inserters, ...)? Is there a particular reason for that ordering?
No. I tried grouping entities by type and updating them in that order and it had zero measurable impact - neither positive nor negative.

I think that's where memory pools come into play. Since you don't use them I assume the entities are randomly allocated as needed and you end up with them spread all over the memory. So doing all drills in order will not give you sequential memory access. On the other hand if you have a memory pool just for drills you can prefetch the memory for the next drill while doing the first. Sometimes that is a huge improvement.

It can also save a lot of memory since you can use different alignment in a pool. For example a 24 byte structure can be 8 byte aligned in a pool instead of the generic 16 byte alignment saving 25% of memory.
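The 25% figure can be checked with a few lines of compile-time arithmetic. This is only a sketch: `DrillRecord` is a hypothetical 24-byte entity invented for the example, and the 16-byte figure is the generic allocator alignment assumed in the post.

```cpp
#include <cstddef>
#include <cstdint>

// Round size up to the next multiple of align.
constexpr std::size_t round_up(std::size_t size, std::size_t align) {
    return (size + align - 1) / align * align;
}

// Hypothetical entity: three 8-byte fields, 8-byte alignment.
struct alignas(8) DrillRecord {
    std::uint64_t progress;
    std::uint64_t energy;
    std::uint64_t ore_left;
};
static_assert(sizeof(DrillRecord) == 24, "three 8-byte fields");

// Space taken per object in a dedicated pool vs. a generic 16-byte aligned slot.
constexpr std::size_t pool_slot    = round_up(sizeof(DrillRecord), alignof(DrillRecord));
constexpr std::size_t generic_slot = round_up(sizeof(DrillRecord), 16);
constexpr std::size_t saved_percent = 100 * (generic_slot - pool_slot) / generic_slot;

static_assert(pool_slot == 24, "packed at 8-byte alignment");
static_assert(generic_slot == 32, "padded to 16-byte alignment");
static_assert(saved_percent == 25, "8 of every 32 bytes is padding");
```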

mrvn · Post by **mrvn** » Mon Feb 13, 2017 10:52 am

Rseding91 wrote:Threading the entity update cycle of Factorio has one *massive* problem:

The top 5 CPU consuming entities:

Belts (and all derivatives)

Biters

Inserters

Pipes

Robots
All of them are "slow" because they interact with other entities or other data sets outside the given entity and as such can't be threaded outright without adding data-races. Threading everything except the "external interaction" gives very minimal gains and makes the codebase shit to work with.

Right now our turn around time on most bugs is < 1 day. In most cases less than an hour after a developer starts working on a given bug. Adding hacked threading into the mix will as I said above: make the codebase shit to work with and make bug fixing and feature work take magnitudes more time while in the end giving minimal gains (<10%~).

It's just not worth it. At this stage in Factorio development adding in threading would be a colossal mistake that would slow development and cause far more harm than good.

Belts interact with inserters, so that is a problem. But do biters interact with belts in any blocking way? It's not like a belt will stop moving because a biter is near.
So why not multithread there? One thread moving all the belts, one thread moving biters, one thread doing pipes.

There are some interactions there, like a biter destroying a belt. But it should be easy to have the biter decide to attack the belt in one tick, and then in the next tick the belt sees that it got attacked, lost its last health point, and blows itself up. By spreading cause and effect across 2 ticks a lot of waiting can be avoided.

Similar with robots (and biters). I assume the interaction there is only biters seeing robots in flight and attacking them. I haven't seen robots avoiding biters. Flights can be scheduled in advance: the bot knows it's going in a straight line for X ticks. If something happens to change the plan, that can be spread over 2 ticks too. The robot would change its mind and file a new plan starting at the next tick. Call it reaction time.
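A minimal sketch of the "cause in one tick, effect in the next" idea, assuming a made-up `World` with an outbox/inbox pair. None of these types come from Factorio. The example runs single-threaded for simplicity; the point is only that the two update functions touch disjoint data.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

struct Belt { int health = 3; bool alive = true; };
struct DamageEvent { std::size_t belt; int amount; };

struct World {
    std::vector<Belt> belts;
    std::vector<DamageEvent> outbox;  // written only by the biter update
    std::vector<DamageEvent> inbox;   // read only by the belt update
};

// Biter phase: records the decision to attack, never touches a belt.
void update_biters(World& w, std::size_t target) {
    w.outbox.push_back({target, 1});
}

// Belt phase: applies the damage that was decided during the previous tick.
void update_belts(World& w) {
    for (const DamageEvent& e : w.inbox) {
        assert(e.belt < w.belts.size());
        Belt& b = w.belts[e.belt];
        if (!b.alive) continue;
        b.health -= e.amount;
        if (b.health <= 0) b.alive = false;  // belt "blows itself up"
    }
    w.inbox.clear();
}

// One tick. The two updates share no mutable data, so in principle they
// could run on separate threads; the swap is the only hand-over point.
void tick(World& w, std::size_t biter_target) {
    update_biters(w, biter_target);
    update_belts(w);
    std::swap(w.inbox, w.outbox);
}
```

The cost is one tick of "reaction time" between the attack and the damage. Whether that is acceptable for determinism and gameplay is a design decision this sketch does not answer.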

ratchetfreak · Post by **ratchetfreak** » Mon Feb 13, 2017 11:09 am

mrvn wrote:
Rseding91 wrote:
Zulan wrote:
Rseding91 wrote:
Zulan wrote:One general question that would help my understanding of what concepts can be applied: In what order are the main update methods of entities currently executed?
In the order they're marked as active on a given chunk which changes as they go offline due to not having work to do.
So it's not even grouped by type of entity (e.g. first Factories, then drills, then belts, then inserters, ...)? Is there a particular reason for that ordering?
No. I tried grouping entities by type and updating them in that order and it had zero measurable impact - neither positive nor negative.
I think that's where memory pools come into play. Since you don't use them I assume the entities are randomly allocated as needed and you end up with them spread all over the memory. So doing all drills in order will not give you sequential memory access. On the other hand if you have a memory pool just for drills you can prefetch the memory for the next drill while doing the first. Sometimes that is a huge improvement.

It can also save a lot of memory since you can use different alignment in a pool. For example a 24 byte structure can be 8 byte aligned in a pool instead of the generic 16 byte alignment saving 25% of memory.

The real mem cost of lots of malloced bits of memory is the overhead of the prefix that the allocator prepends to just about every allocation. Those will be 4 or 8 bytes at least to store the size of the allocation + some data about the next/previous chunk of memory it was allocated from. I dislike the malloc/free api for this reason.

mrvn · Post by **mrvn** » Tue Feb 14, 2017 11:20 am

ratchetfreak wrote:
mrvn wrote:
Rseding91 wrote: No. I tried grouping entities by type and updating them in that order and it had zero measurable impact - neither positive nor negative.
I think that's where memory pools come into play. Since you don't use them I assume the entities are randomly allocated as needed and you end up with them spread all over the memory. So doing all drills in order will not give you sequential memory access. On the other hand if you have a memory pool just for drills you can prefetch the memory for the next drill while doing the first. Sometimes that is a huge improvement.

It can also save a lot of memory since you can use different alignment in a pool. For example a 24 byte structure can be 8 byte aligned in a pool instead of the generic 16 byte alignment saving 25% of memory.
The real mem cost of lots of malloced bits of memory is the overhead of the prefix that the allocator prepends to just about every allocation. Those will be 4 or 8 bytes at least to store the size of the allocation + some data about the next/previous chunk of memory it was allocated from. I dislike the malloc/free api for this reason.

You are right. But 4 or 8 bytes? When did you last look at a memory allocator? The allocator needs to know the size of the chunk of memory because "free(addr)" doesn't tell it how large it is. And ignoring 32-bit systems, we can happily allocate 8 GB of RAM, so the size needs to be 64-bit too. Often a pointer to the next block is used instead of the size. Either way you need 8 bytes just for that. Then you usually also need the size of the previous block, or a pointer to it, so free memory can be merged, resulting in a 16-byte header. That works out nicely with 16-byte aligned chunks for best performance with SSE. But if the allocator also has a canary, then you get a 24-byte header and (size+24) rounded up to the next multiple of 16. So the common overhead per allocation is 24-32 bytes.

On the other hand a memory pool for a fixed size object can be as efficient as 16 byte overhead for the whole pool while 32 or 64 byte is more realistic. Per pool, not per object. So my previous 25% was understating it just a bit. It's 24 byte instead of 64, close to 1/3rd the memory footprint.

Note: Some malloc implementations do use pools internally for small objects. I don't think glibc under Linux does, though.
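For illustration, a fixed-size pool with no per-object header can be sketched like this. It is a simplified example, not how any particular allocator or Factorio does it; `FixedPool` and `DrillState` are hypothetical names. Free slots are chained through the slot storage itself, so the only bookkeeping is per pool.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <new>
#include <utility>
#include <vector>

// Hypothetical 24-byte entity used only for this example.
struct DrillState { std::uint64_t a, b, c; };

template <typename T>
class FixedPool {
    union Slot {
        Slot* next_free;                              // valid while the slot is free
        alignas(T) unsigned char storage[sizeof(T)];  // valid while it holds a T
    };
    std::vector<Slot> slots_;   // one contiguous array, never resized
    Slot* free_head_ = nullptr;

public:
    explicit FixedPool(std::size_t capacity) : slots_(capacity) {
        for (Slot& s : slots_) { s.next_free = free_head_; free_head_ = &s; }
    }
    FixedPool(const FixedPool&) = delete;
    FixedPool& operator=(const FixedPool&) = delete;

    // Returns nullptr when the pool is exhausted.
    template <typename... Args>
    T* create(Args&&... args) {
        if (free_head_ == nullptr) return nullptr;
        Slot* s = free_head_;
        free_head_ = s->next_free;
        return new (s->storage) T(std::forward<Args>(args)...);
    }

    void destroy(T* p) {
        assert(p != nullptr);
        p->~T();
        Slot* s = reinterpret_cast<Slot*>(p);  // storage sits at offset 0 of the slot
        s->next_free = free_head_;
        free_head_ = s;
    }

    static constexpr std::size_t slot_size() { return sizeof(Slot); }
};
```

With the 24-byte `DrillState`, `FixedPool<DrillState>::slot_size()` is 24 on a typical 64-bit platform, compared with the 48 to 64 bytes per object estimated above for a general-purpose allocator.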

ratchetfreak · Post by **ratchetfreak** » Tue Feb 14, 2017 11:49 am

mrvn wrote:
You are right. But 4 or 8 bytes? When did you last look at a memory allocator? The allocator needs to know the size of the chunk of memory because "free(addr)" doesn't tell it how large it is. And ignoring 32-bit systems, we can happily allocate 8 GB of RAM, so the size needs to be 64-bit too. Often a pointer to the next block is used instead of the size. Either way you need 8 bytes just for that. Then you usually also need the size of the previous block, or a pointer to it, so free memory can be merged, resulting in a 16-byte header. That works out nicely with 16-byte aligned chunks for best performance with SSE. But if the allocator also has a canary, then you get a 24-byte header and (size+24) rounded up to the next multiple of 16. So the common overhead per allocation is 24-32 bytes.

On the other hand a memory pool for a fixed size object can be as efficient as 16 byte overhead for the whole pool while 32 or 64 byte is more realistic. Per pool, not per object. So my previous 25% was understating it just a bit. It's 24 byte instead of 64, close to 1/3rd the memory footprint.

Note: Some malloc implementations do use pools internally for small objects. I don't think glibc under Linux does, though.

I forgot to add "at least" to that sentence, in cases the allocator uses just a pointer or index into an auxiliary metadata store.

I would very much prefer it if malloc also returned some metadata for the block that needs to be passed in when freeing or reallocating. It would also let user code discover the actual size of the allocated block without having to know the internals of the allocator, which could change without warning. For example, the optimal allocation size is usually those 24-32 bytes of overhead less than a power of 2 (up to the 4k page size), because that way the allocated block doesn't overlap into the next block. A container could use the extra space that becomes available that way.

orost · Post by **orost** » Wed Feb 15, 2017 1:33 am

Here's something that has helped me with locality-related performance problems.

There is an obscure data structure called a colony. It's essentially a linked list of arrays that keeps track of their contents and creates and deletes them as needed. It combines the characteristics of std::list and std::vector: it allows very fast, arbitrary insertions and deletions without invalidating pointers or iterators and has iteration performance not too far off from a vector. The tradeoff is a memory penalty, which varies depending on your usage pattern, but is usually modest.

So if you have entities on the heap and a vector of pointers to them and lack of locality is killing your performance, it should be fairly straightforward (I don't know your codebase, so idk for sure, but it should) to replace that with a colony that directly contains those entities, for a major performance gain, at least when it comes to iterating through it or nearby accesses.

Here's a talk about colonies, with an explanation of how they work, some benchmarks, and a link to an implementation.

https://www.youtube.com/watch?v=wBER1R8YyGY
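As a rough illustration of the idea (not the implementation from the talk, which uses a skipfield and is considerably more refined), a colony-like container can be sketched as a list of fixed-size blocks whose slots are reused after erasure. `MiniColony` and its `handle` type are names invented for this example; it needs C++17 for `std::optional`.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <memory>
#include <optional>
#include <utility>
#include <vector>

template <typename T, std::size_t BlockSize = 64>
class MiniColony {
public:
    // A handle stays valid until its element is erased, because blocks
    // are never moved or reallocated.
    using handle = std::optional<T>*;

    handle insert(T value) {
        handle slot;
        if (!free_.empty()) {              // reuse an erased slot first
            slot = free_.back();
            free_.pop_back();
        } else {
            if (blocks_.empty() || blocks_.back()->used == BlockSize)
                blocks_.push_back(std::make_unique<Block>());
            Block& b = *blocks_.back();
            slot = &b.slots[b.used++];
        }
        slot->emplace(std::move(value));
        return slot;
    }

    void erase(handle h) {
        assert(h != nullptr && h->has_value());
        h->reset();
        free_.push_back(h);
    }

    // Walks each block front to back and skips erased slots.
    template <typename F>
    void for_each(F&& f) {
        for (auto& block : blocks_)
            for (std::size_t i = 0; i < block->used; ++i)
                if (block->slots[i]) f(*block->slots[i]);
    }

private:
    struct Block {
        std::array<std::optional<T>, BlockSize> slots;
        std::size_t used = 0;
    };
    std::vector<std::unique_ptr<Block>> blocks_;
    std::vector<handle> free_;
};
```

Elements live directly inside the blocks, so iterating touches memory mostly sequentially, while other objects can keep a `handle` to an element the same way they would keep a pointer to a heap object.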

Post by **Rseding91** » Wed Feb 15, 2017 2:26 am

orost wrote:Here's something that has helped me with locality-related performance problems.

There is an obscure data structure called a colony. It's essentially a linked list of arrays that keeps track of their contents and creates and deletes them as needed. It combines the characteristics of std::list and std::vector: it allows very fast, arbitrary insertions and deletions without invalidating pointers or iterators and has iteration performance not too far off from a vector. The tradeoff is a memory penalty, which varies depending on your usage pattern, but is usually modest.

So if you have entities on the heap and a vector of pointers to them and lack of locality is killing your performance, it should be fairly straightforward (I don't know your codebase, so idk for sure, but it should) to replace that with a colony that directly contains those entities, for a major performance gain, at least when it comes to iterating through it or nearby accesses.

Here's a talk about colonies, with an explanation of how they work, some benchmarks, and a link to an implementation.

https://www.youtube.com/watch?v=wBER1R8YyGY

Entities aren't updated in the order they're allocated and additionally virtually all entities that are updatable have dynamically allocated member data or touch other entities/data sets they don't own.

So, none of that actually works in a real-world scenario where all things that do logic end up being dynamic and fragmented in memory.

When something isn't fragmented or has incredibly simple logic it's never something that is capable of doing any useful logic due to those limitations: not being able to touch memory outside of itself.

orost · Post by **orost** » Wed Feb 15, 2017 2:44 am

I'm not sure I understand what you're saying. Handling objects that are externally referred to by lots of other objects is the whole point of colonies. Iterators and pointers to an element of a colony are guaranteed to remain valid as long as that element exists, so you can refer to elements of a colony from other objects in exactly the same way as you'd refer to heap objects, while still having the benefit of storing them in large-ish contiguous blocks that play well with cache.

But if you aren't iterating through containers of entities front-to-back, then, yeah, most of the benefit would be wasted.

ratchetfreak · Post by **ratchetfreak** » Wed Feb 15, 2017 11:24 am

Rseding91 wrote: Entities aren't updated in the order they're allocated and additionally virtually all entities that are updatable have dynamically allocated member data or touch other entities/data sets they don't own.

So, none of that actually works in a real-world scenario where all things that do logic end up being dynamic and fragmented in memory.

When something isn't fragmented or has incredibly simple logic it's never something that is capable of doing any useful logic due to those limitations: not being able to touch memory outside of itself.

I still feel that you could turn a lot of per-frame poll updates into rare push updates.

For example an inserter waiting to pick something up from a belt into a chest. Instead of the inserter checking the chest and belt to see if it could pick up an item and the chest has space for it. The belt could notify the inserter that a new item has come into its range, the chest could also notify the inserter that space for a type of item has become (un)available.

Basically rather than let the entity check a complex and remote condition every frame let the entity sleep until an event that could change the condition occurs. After that the entity could double check the condition so you can have a coarser granularity of notification. For debugging you could make the game run in always-check mode and assert/log each time a condition changes without also getting an event that frame.

It does necessitate some kind of event subscription service. Like an inserter subscribing to a chest's inventory change/destruction or a location where something with an inventory could be built, a bit of track where a train could park,... but if sending an event is an order of magnitude less common than the entity would need to check without it it could very well be worth it.

Post by **Rseding91** » Wed Feb 15, 2017 2:15 pm

ratchetfreak wrote:
Rseding91 wrote: Entities aren't updated in the order they're allocated and additionally virtually all entities that are updatable have dynamically allocated member data or touch other entities/data sets they don't own.

So, none of that actually works in a real-world scenario where all things that do logic end up being dynamic and fragmented in memory.

When something isn't fragmented or has incredibly simple logic it's never something that is capable of doing any useful logic due to those limitations: not being able to touch memory outside of itself.
I still feel that you could turn a lot of per-frame poll updates into rare push updates.

For example an inserter waiting to pick something up from a belt into a chest. Instead of the inserter checking the chest and belt to see if it could pick up an item and the chest has space for it. The belt could notify the inserter that a new item has come into its range, the chest could also notify the inserter that space for a type of item has become (un)available.

Basically rather than let the entity check a complex and remote condition every frame let the entity sleep until an event that could change the condition occurs. After that the entity could double check the condition so you can have a coarser granularity of notification. For debugging you could make the game run in always-check mode and assert/log each time a condition changes without also getting an event that frame.

It does necessitate some kind of event subscription service. Like an inserter subscribing to a chest's inventory change/destruction or a location where something with an inventory could be built, a bit of track where a train could park,... but if sending an event is an order of magnitude less common than the entity would need to check without it it could very well be worth it.

Most of the game already works like this. That's why update order is seemingly "random". You can watch things go inactive and wake up by enabling the "active state" debug option.

Zulan · Post by **Zulan** » Wed Feb 15, 2017 6:02 pm

An interesting anecdote related to memory fragmentation: during AntiElites last 100% achievement speedrun, he died for the first time quite late in the game, when UPS had already dropped significantly. After reloading the game due to the happy accident with biters, there was a noticeable increase in UPS. Unfortunately, due to the nature of the speedrun, this is not a reliable experiment. It is extremely difficult to reproduce, given that it requires organically building up a huge UPS-eating base without loading the game in between. Also, UPS is highly volatile at that stage of the game.

The question that arises: are entities saved by geographic location, and is memory fragmentation therefore maybe much less of an issue than one might think, given that most active entities are loaded from a save rather than dynamically created during a play session?

Post by **Rseding91** » Wed Feb 15, 2017 6:07 pm

Zulan wrote:The question that arises: Are entities saved by geographic location...

Yes they are.

mrvn · Post by **mrvn** » Thu Feb 16, 2017 9:45 am

Zulan wrote:An interesting anecdote related to memory fragmentation: during AntiElites last 100% achievement speedrun, he died for the first time quite late in the game, when UPS had already dropped significantly. After reloading the game due to the happy accident with biters, there was a noticeable increase in UPS. Unfortunately, due to the nature of the speedrun, this is not a reliable experiment. It is extremely difficult to reproduce, given that it requires organically building up a huge UPS-eating base without loading the game in between. Also, UPS is highly volatile at that stage of the game.

The question that arises: are entities saved by geographic location, and is memory fragmentation therefore maybe much less of an issue than one might think, given that most active entities are loaded from a save rather than dynamically created during a play session?

I noticed that too on a slow system. When starting again from a save game the next day, it was always faster for a while.

mrvn · Post by **mrvn** » Thu Feb 16, 2017 9:57 am

Rseding91 wrote:
ratchetfreak wrote:I still feel that you could turn a lot of per-frame poll updates into rare push updates.

For example an inserter waiting to pick something up from a belt into a chest. Instead of the inserter checking the chest and belt to see if it could pick up an item and the chest has space for it. The belt could notify the inserter that a new item has come into its range, the chest could also notify the inserter that space for a type of item has become (un)available.

Basically rather than let the entity check a complex and remote condition every frame let the entity sleep until an event that could change the condition occurs. After that the entity could double check the condition so you can have a coarser granularity of notification. For debugging you could make the game run in always-check mode and assert/log each time a condition changes without also getting an event that frame.

It does necessitate some kind of event subscription service. Like an inserter subscribing to a chest's inventory change/destruction or a location where something with an inventory could be built, a bit of track where a train could park,... but if sending an event is an order of magnitude less common than the entity would need to check without it it could very well be worth it.
Most of the game already works like this. That's why update order is seemingly "random". You can watch things go inactive and wake up by enabling the "active state" debug option.

It's called event-based programming.

The problem is that you just inverted the problem. Instead of the inserter checking the belt every frame, the belt now wakes up the inserter for every item. A lot of the time inserters wait on their destination and not on their source, e.g. waiting for space in a chest. But every item going by on the belt will wake them up. There isn't an item every tick, so that is still a win, but ...

The next question becomes: Do you unsubscribe the inserter from the belt when it is blocked at the other end? If you make subscribing cheap then the inserter can subscribe to the source or destination depending on which end it is waiting on. Hey, you can even subscribe to the power pole if it is out of energy.

The other question is: how expensive is emitting the event when nothing is connected to it? There are a lot of belts with no inserter next to them. They all need to check whether they have to wake one up when a new item comes by, just to see that nothing is connected.

It's all a balancing act. The time you save on one end might be wasted on the other. And only testing real games will tell you which side costs more time.

ratchetfreak · Post by **ratchetfreak** » Thu Feb 16, 2017 11:54 am

mrvn wrote:
It's called event based programming

The problem is that you just inverted the problem. Instead of the inserter checking the belt every frame, the belt now wakes up the inserter for every item. A lot of the time inserters wait on their destination and not on their source, e.g. waiting for space in a chest. But every item going by on the belt will wake them up. There isn't an item every tick, so that is still a win, but ...

The next question becomes: Do you unsubscribe the inserter from the belt when it is blocked at the other end? If you make subscribing cheap then the inserter can subscribe to the source or destination depending on which end it is waiting on. Hey, you can even subscribe to the power pole if it is out of energy.

The other question is: how expensive is emitting the event when nothing is connected to it? There are a lot of belts with no inserter next to them. They all need to check whether they have to wake one up when a new item comes by, just to see that nothing is connected.

It's all a balancing act. The time you save on one end might be wasted on the other. And only testing real games will tell you which side costs more time.

That's why I mentioned the "order of magnitude".

For sending an event I basically envision something like foreach(auto subscription: subscribers) if(subscription.active && subscription.type == triggeredEvent) subscription.entity->triggerEvent(subscription.index); With a small-array optimization for the list, this will only require a pointer dereference when it actually needs to send an event.

On the receiving side the entity can check only the events it's blocked on before double-checking remote entity state. That way an inserter won't have to check the belt each time a new item comes along when the chest it's trying to fill is already full and didn't change. The active field in the subscription lets an entity temporarily unsubscribe from an event. With the segments implementation, the segment can be split up in a way that avoids too many inserters listening to the same segment.

For example, the inserter has 4 distinct states: waiting on pickup, rotating to dropoff, waiting on dropoff, and rotating to pickup.

In the rotation states it only needs to wait on a power-actuated timer.

In the waiting-on-pickup state it needs to wait until the receiving entity has room for an item, so it subscribes to inventory change events from that receiving entity (along with destroy and create for the location) and to inventory change events on the pickup side (again with destroy and create).

For waiting on dropoff it only needs to listen for inventory changes (and again create and destroy) in the receiving entity.
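The scheme described in this post could be sketched roughly as follows. This is an illustration built on the pseudocode above, not Factorio code: `Publisher`, `EventType`, the `Inserter` fields, and the `kSource`/`kDestination` indices are all assumptions made for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

enum class EventType { InventoryChanged, Created, Destroyed };

struct Entity {
    virtual ~Entity() = default;
    virtual void triggerEvent(std::size_t index) = 0;
};

struct Subscription {
    Entity* entity;
    EventType type;
    std::size_t index;  // tells the subscriber which watched condition fired
    bool active;        // false = temporarily unsubscribed
};

// Anything that can be listened to: a belt segment, a chest, a power pole...
struct Publisher {
    std::vector<Subscription> subscribers;

    void emit(EventType triggeredEvent) {
        for (const Subscription& s : subscribers) {
            assert(s.entity != nullptr);
            if (s.active && s.type == triggeredEvent)
                s.entity->triggerEvent(s.index);
        }
    }
};

enum class InserterState {
    WaitingOnPickup, RotatingToDropoff, WaitingOnDropoff, RotatingToPickup
};

struct Inserter : Entity {
    static constexpr std::size_t kSource = 0;
    static constexpr std::size_t kDestination = 1;

    InserterState state = InserterState::WaitingOnPickup;
    int wakeups = 0;  // counts how often the inserter really had to wake up

    void triggerEvent(std::size_t index) override {
        // While waiting on dropoff only the destination is interesting.
        if (state == InserterState::WaitingOnDropoff && index != kDestination)
            return;
        ++wakeups;  // a real entity would now re-check the remote condition
    }
};
```

An inserter blocked on a full chest then ignores belt events cheaply, and setting `active` to false avoids even the virtual call.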

Post by **Rseding91** » Thu Feb 16, 2017 6:47 pm

ratchetfreak wrote:That's why I mentioned the "order of magnitude".

For sending an event I basically envision something like foreach(auto subscription: subscribers) if(subscription.active && subscription.type == triggeredEvent) subscription.entity->triggerEvent(subscription.index); with small array optimization for the list this will only require a pointer deref when it needs to actually send an event.

On the receiving side the entity can check only the events it's blocked on before double-checking remote entity state. That way an inserter won't have to check the belt each time a new item comes along when the chest it's trying to fill is already full and didn't change. The active field in the subscription lets an entity temporarily unsubscribe from an event. With the segments implementation, the segment can be split up in a way that avoids too many inserters listening to the same segment.

For example for the inserter there are 4 distinct states, waiting on pickup, rotating to dropoff, waiting on dropoff, rotating to pickup.

in the rotation states they only need to wait on a power-actuated timer.

in the waiting on pickup state it needs to wait until the receiving entity has room for an item and subscribes to inventory change events from that receiving entity (along with destroy and create for the location) and for inventory change events on the pickup side (again with destroy and create).

For waiting on dropoff it only needs to listen for inventory changes (and again create and destroy) in the receiving entity.

So pretty much what we already have? This is the assembling machines version:

```cpp
void CraftingMachine::postTransferHook(const NotificationData& data)
{
  assert(data.immediateInventory);

  if (data.immediateInventory == &this->moduleInventory)
  {
    if (data.itemID.getPrototype()->isModule())
      this->effectReceiver.rebuildSources(this->moduleInventory, this->boundingBox, this->getSurface());
    else if (data.change > 0)
      LOG_AND_ABORT("Attempting to add non-module item to module inventory.");
  }

  if (!this->isSetup()) // Fast-replaces transfers inventories before setup ... don't hit alarm.
    return;

  if (data.immediateInventory == &this->sourceInventory)
  {
    if (data.change > 0)
    {
      /** The input inventory changed positive: if the CraftingMachine is not inactive from lack of power, alarm it.*/
      if (this->sleepReason != ProductionResult::NoPower)
        this->alarm();
    }
    /** The input inventory changed negative: alarm input inserters to possibly add more items.*/
    else if (data.change < 0)
      this->inputWakeUpList.alarm();
  }
  else if (data.immediateInventory == &this->resultInventory)
  {
    if (data.change > 0)
    {
      if (this->sleepReason != ProductionResult::ItemProductionOverload)
        this->outputWakeUpList.alarm();
    }
    else if (data.change < 0)
    {
      /** Only alarm the CraftingMachine if it was inactive from production overload.*/
      if (this->sleepReason == ProductionResult::ItemProductionOverload)
        this->alarm();
      this->inputWakeUpList.alarm();
    }
  }
  else if (data.immediateInventory == this->energySource->getFuelInventory())
  {
    /** Only alarm the CraftingMachine when its fuel contents were changed if it was inactive from lack of fuel.*/
    if (this->sleepReason == ProductionResult::NoPower)
      this->alarm();
    /** If fuel was removed from the fuel inventory alarm input inserters to possibly add more.*/
    if (data.change < 0)
      this->inputWakeUpList.alarm();
  }
}
```

ratchetfreak · Post by **ratchetfreak** » Thu Feb 16, 2017 9:25 pm

Rseding91 wrote: So pretty much what we already have? This is the assembling machines version:

looks like it ...

Only addition is that alarm could take a parameter to be able to early out in case it's not something the entity should wake up for.

GeniusIsme · Post by **GeniusIsme** » Sat Feb 18, 2017 11:55 pm

Rseding91 wrote: We don't currently make wide use of memory pools and as such we don't have memory leaks, off-by-one errors, or memory-access related problems and I want it to stay that way.

Have you started, by any chance, to use smart pointers?
