More Sugar please

sugarscape
multi-agent simulation
agent-based modeling
agent-based social simulation
multi-agent reinforcement learning
sequential social dilemmas
cooperation
Markov games
non-cooperative games
complex signaling system
Author

Oren Bochman

Published

Tuesday, June 11, 2024

In Sugar ad Astra I covered some research questions I had about Sugarscape. Today I’d like to mediate on some ideas I have for extending sugarscape to make it more of curriculum learning environment for multi-agent reinforcement learning, where agents can have many and varied lifelong interactions with the environment and other agents.

If agents also learn to communicate, thier emergent languages under this settings should be more sophisticated and provide some inkling of how natural languages might have evolved.

Some ideas are about extending the environment to add more emergent properties. Other ideas are more about agents learning to generalize about the environment by providing a richer and more complex environment to handle.

Amazing landscapes

SandPile Sugarscape

In (Bak, Tang, and Wiesenfeld 1988) the authors introduced the abelian sandpile model. They showed that a simple model of sandpile could generate complex landscapes.

Bak, P., Chao Tang, and K. Wiesenfeld. 1988. “Self-Organized Criticality.” Physical Review. A, General Physics 38 1: 364–74.

We could incorporate this model to generate the landscape of sugarscape. In this version rigion with sugar are created by dropping sugar on the landscape. The sugar will pile up in some regions and create hills and valleys. The agents will then need to navigate this complex landscape to gather resources. If they are not careful they might fall into a valley and be unable to get out. If they are unlucky they might be caught in a landslide and be buried in sugar.

Xland Sugarscape

The paper (Team et al. 2021) described a new approach to learning called Open-Ended Learning. In it an functional genrated environment was descibed called Xland. I think that one way to make sugarscape more more challanding is by composing more complex landscapes.

Team, Open-Ended Learning, Adam Stooke, Anuj Mahajan, C. Barros, Charlie Deck, Jakob Bauer, Jakub Sygnowski, et al. 2021. “Open-Ended Learning Leads to Generally Capable Agents.” ArXiv abs/2107.12808.

Xland uses Quantum Wave Collapse algorithm to generate 3d landscapes. This is a type of procedural generation.

When a the agents needs to navigate a complex lanscape to gather resources survival becomes more challenging. This in can lead to more importance for trade between agent and even between communities. If things get really tough agents might prefer combat to trade.

  • A more complex landscape might also include more layers than Xland.
  • Different layers could correspond to different (mineral or farming) resources.

Ecological Sugarscape

  • Adding to the landscapes tiles for different biomes to the xland landscape at different elevations.

  • An ecology is a complex system of interdependent organisms.

  • We can model this using a predator-prey models The different biomes have animals in a predator-prey relationship. They keep the population of each other in check and husband the resources of the land.

    Thier populations are governed by lekrotta-volterra differntial equations.

    Such Pairs might be:

    • For glaciers: Yettis and penguins
    • For mountains: Eagles and
    • For rivers: Bears and salmon
    • For lakes: Kingfishers and fish
    • For islands: Sharks and Seals
    • For deserts: Snakes, scorpions and spiders
    • For forests: Wolves and deer and rabbits
    • For swamps: Alligators and frogs and flies
    • For caves: Bats and insects
    • For volcanoes: Lava and rocks (credit: to (GH-Copilot?) for this one)
    • For plains: Lions and wildebeest
    • For Savannah: Elephants and antelopes

    As one is hunted e.g. stag hunt targetting deer will boost the numbers of rabbits if the number of deer is too small they might be decimated by wolves and the rabbits. One of both of these may emerge from thier habitat and attack the agents or thier crops.

    • Agents would need to learn a model of the ecology to avoid disrupting it.
    • Agent could hunt stags, wolves and rabbits in proportion to thier population.
      • Hunting wolves is risky
      • Hunting rabbits means defaulting in a stag hunt with loss of reputation.
    • Agents might learn to domesticate wolves, deer and rabbits.

    If agent disrupts the ecology they might find themselves facing any number of ecological disasters.

    • A plague of locusts might eat all the crops.
    • A plague of rats might consume all their stores of sugar.
    • A flight of bats might start a pandemic.
    • Cutting down trees might lead to deforestation, erosion and desertification.
      • erosion would remove the river and can increase risks of a flood.
      • erosion would reduce the carrying capacity of the land by reducing max_growth to 3
      • this would elimanate the salmon and enrages the bears.
      • bears are almost impossible to hunt and will attack agents and their livestock.
      • bear hunt requires a coordinated effort of agents to drive the bear into a trap.
      • deforestation would reduce the carrying capacity of the land by reducing max_growth to 2
      • desertification would reduce the carrying capacity of the land by reucing max_growth to
  • Each biome might have supply different resources as well as different challenges.

  • Agents must learn how each biomes operates to survive and possibly increase its yields.

  • Cutting down certain trees might reduce the risk of fire in a forest biome.

  • Cutting down others might lead to deforestation, erosion and desertification.

  • Hunting can create a food source, furs, and bones for tools

  • Hunting can also lead to burst in the growth of undesired species that will target crops,
    attack ther agents and spread desease.

    • Can agents learn to balance their burgeoning society within a complex ecology?
    • Can they learn to harvest wild resources sustainably?
    • Can they learn to manage the environment to increase the carrying capacity of the land, without disrupting the ecology?
    • Can they learn to hunt without causing the extinction of the prey or it’s predators?
    • Agent that disrupt habitats may introduce a pandemic.
    • Cutting down trees can lead to deforestation, reduction of rain, fires and famine.
    • Over fishing can lead to the extinction of fish species.
    • Can they domesticate animals? -Raising too many cattle can lead to desertification.
  • Nature could be made less and less forgiving to our agents.

    • droughts, fires, floods, hurricanes, tsunamies, earthquakes, avalanches, mudslides, etc.
    • plagues of locusts, rats, bats, that eat crops if their predetors are killed.
    • invasive species like weeds or kazoo that might replace the crops.
      • specialists might emerge that can clean these up.
      • if the root cause is not mitigated the specialists might become a pest and a coordinated clean up the environment would be needed by agents.
      • the area might need a fire to completely burn out the invasive species.
      • pollution would then be generated forcing agents to relocate until it dissipates.
    • apex predators that will emerge to prey on agents and thier livestock if their territory is encroached upon.
    • resources generating land can become less fertile if surrounding ecology is disrupted.

Floods come to Sugarscape

  • It is fairly easy to add a elevation to the environment. We can then use the elevation and a threshold to define a water level. When it rains water level rises. Some ideas about rainfall:

  • Rainfall is local and seasonal so there might be a lake in a mountains with higher water level then say a sea at the lowest elevation.

  • Mountains of high eleveation might have snow that generates a source of water.

  • Water from a lake might flow to a lower elevation by creating a river.

  • Water levels drop if rain stops.

  • Agents might drown if they are in a cell with water level above a certain threshold.

  • Agents might also be able to swim.

  • Agents might be able to build boats - Operating the boat requires two or more agents who must learn to cooperate and coordinate to move the boat.

  • Agents could cooperate to build dams, thus control water flow away from regions of greatest yield. This is a public good which would increase the world’s carrying capacity.

    • four agents are needed to build the dam.
    • the must burn k sugar into caramel to construct the dam.
  • Agents might be able to build bridges. which might decrease the cost of movement, and give access to needed resources, a second public good game.

  • Agents might be able to build canals - these might together with boats would allow more efficient drainage and trading with boats along the canal, a public goods game. If there are many resources a canal system might be needed to allow agents trade efficently so as to survive

  • After a flood Agents might need to cooperate to drain an area to restore land to harvest it to full capacity.

    • Land with lower elevation nearby will drain automatically by one unit for each lower elevation land.
    • land next to fully drained land will drain by one unit.
    • restoring land farming capacity at a rate of k per time step.
    • Land with with no lower elevation will need to be drained by agents.
    • Land with higher elevation will need to be drained by agents.
    • Draining will require three agent to stand together on three sides of the cell to drain it to the open side.
    • Draining will require a certain amount of time. At this time free loaders may use the cleared land to harvest resources. Idealy though the agents should switch roles or better yet share the reources harvested in some way while draining is going on.
    • We can complicate this by making the draining move water from one cell to another until it is pushed to a lower elevation. More agent participate they can drain the area faster. This is a stag hunt game. However as more areas are drained agents might prefer/need to harverst.
    • Eventual there may be more land then agent so that there is no immediate benefits to draining.
    • Draining might be done might need to be done in a coordinated way to avoid flooding other areas. Draining might need to be a stag hunt game where two to four agents must coordinate to drain the area. All must stand on different sides to drain the area.
  • Seasonal flooding can be added by adding a rule that increases the height of water creating islands and complicating movement.

Kinship comes to Sugarscape

The book discusses culture and friend networks. However, besides considering how these night emerge from the rules, it does not consider how they might influence the evolution of the state. I propose to define a more extensions set of kinship relations and to create two mechanisms by which kinship might influence the evolution of the state. I.e. how kinips interacts with other rules.

  • agent will be aware of relatedness of other agent. This will be a number 1,2,4,,8
  • 1 means they are identical twins, 2 means they are siblings, 4 for cousins, 8 for

once kinship ties are available we can have more fine grained social groups.

  • progeny
  • atomic family
  • extended family
  • tribe
  • top-k-friends (k agents that were in FOV longest)
  • culture
  • we may tie other rules to operate within kinship groups via a constraint on the rule.
  • we may tiw behaviour to kinship groups via terms in the utility function that are shared by members of the group.

Social capital, Reputaion & Prestige comes to Sugarscape

Social capital

is a measure of the value of an individual’s social network. It defines the social resources that an individual has access to. Agent with more social capital can get more done. Social capital can be used to get access to resources, information.

Social trust

this is a the trust between two agents. Agents that have not interacted have no ties. Aganets that have positive interactions have strong positive ties Agents that gave negative interactions have strong negative ties. Social trust increases the chance of future cooperation between agents. (For a trade, a public good game, etc an agent will priorotize cooperate with agents they trust.) Trust might become an element of the utility function controlling movement.

Social prestige or Reputation

this is an aggregate measure of the social trust that an individual has. It is a measure of the respect and admiration that an individual has in a society. Social prestige indicates that an agent has access to more social capital than some other individuals.

Complex dillema tasks: - We may introduce complex tasks that require coordination between many agent for the group to survive and thrive. - a flood might require agents to drain an area to restore land to harvest it to full capacity. - an avalanche might require agents to dig out a buried agent quickly. - a fire might require agents to build a fire break to save thier crops. - a war might require agents to coordinate to defeat a stronger enemy. - a shortage of resources might require agents to coordinate to share resources. - group trades might be required when the group exceeds the carrying capacity of the land. - a public good might be required to increase the carrying capacity of the land. - a boat might need three agents to operate. - two rowers and a time keeper. - access to certain resources might require fabricating tools artifacts by specialists. - in a dragon slaying quest a leader might need to mobilize a team of specialists - some wealty patron to finance the quest - a sword smith, a armorer, to outfit fighers - a ranger to find the layer, - k pages to locate keys to open the doors to the dragon’s lair - a knight access the dragon’s lair and then fight it. - each time step the dragon might return and attack a tribe member or cause a fire, reducing the tribe’s carrying capacity. - in games of environtal cleanup agents might need to coordinate to most effectively harvest and clean up thier environment.

Thus in general being righ isn’t enough, an effective leader must be able to marshal resources and mobilize a team of allies to get the job done. The agents would also need to plan to get the task done. This would require a more complex decision making process

So we might have simulation where agents just focus on one complex task like in the minigrid environment. Then one might add bridge building curribculum learning, bridge building, cleanup, stag hunt, etc.

Complex tasks might require coordination between coalitions of agents. More so when the tasks might also require access mobilizing a diverse network of resources, by specilist agents then we can see how social capital might be beneficial.

require access to scarce resources. A diverse network of social ties might be beneficial in such a case. Agents with more social capital can get more done. Social capital can be used to get access to resources, information, and social prestige. Social prestige is a measure of the respect and admiration that an individual has in a society. It is a measure of the social capital that an individual has.

Agents with more social prestige can get more done. Social prestige can be used to get access to resources, information, and social prestige. Social prestige is a measure of the respect and admiration that an individual has in a society. It is a measure of the social capital that an individual has.

Agents with more social prestige can get more done. Social prestige can be used to get access to resources, information, and social prestige. Social prestige is a measure of the respect and admiration that an individual has in a society. It is a measure of the social capital that an individual has.

Agents that have and social prestige. Social prestige is a measure of the respect and admiration that an individual has in a society. It is a measure of the social capital that an individual has.

Leadership can arise from different factors. In this case we consider a simple model where leadership is determined by social prestige.

Another idea might be to base leadership on social capital. This can be more intersting if agents liked and disliked each other for different reasons creating a more complex social network. Agents score other agents based on thier past interactions. E.g. Agents that warned others of danger might get a higher score. Agents that cry wolf would get a lower score. The strength of ties would then become a basis for cooperation between agents. Agents with a pristine reputation would be more likely to be trusted by other agents. Presence of such agents might be beneficial in setting public goods games.

How does social prestige arise - it can be based on wealth, age, number of progeny, number of friends, number of followers, number of trades, number of battles won, taking risks for the group, sharing resources, sharing information, participating in public goods games, etc.

  • leaders with the most prestige should be able to make decisions that affect the group. Idealy these decision should be made localy in a way that the that the group can only achieve the goal and get the benefits by working together. This can allow the benefits of having a leader.

  • wolfpack stughunt mini-games. agents must form a wolfpack to hunt a stag.

    Make a group trade, like if to trade with other cultures if to go to war,

In this case I would like to add a rules that quickly set up well known social hierarchies.

  • monarchy where the leadership is passed down from the oldest agent to their oldest direct descendant.
  • autocarcy where the strongest agent beomes the leader.
  • communism where there is no leader but all resources are shared.
  • gerontocracy where the oldest agent in a group is the leader.
  • matriarchy where the oldest living female ancestor is the leader.
  • patriarchy where the oldest living male ancestor is the leader.
  • plutocracy where the wealthiest agent is the leader.
  • oligarchy where agents with the most wealth can use it to gain social prestige. The can share excess resources with others in FOV in exchange for social prestige. Nearby agents should be typically be wealthy too so the process could diffuse to the whole group. Sharing wealth can be via gifts or loans. Gifting might take place in the following heuristic - after you have enough food to survive for k years give all excess to the poorest agent in your FOV. If all agents in FOV have food for k-years collect enough food for k+1 years. This would be a public good game. It should also lead to higher carrying capacity of the world. Oligarchs that finance the lion’s share of a public works should get the lion’s share of social prestige. This would in turn make them more likely to retain their wealth. Oligarchs might get social prestige for writing off a debts. Oligarchs who are significantly wealthier than others many individuals in thier FOV could also lose social prestige. Since they are not sharing thier wealth. This would mean that in an afluent society the oligarchs might want to have many children also agents that have children might get large benefits by being next to other oligarcs who would be more likely to share ther wealth with them after giving birth. This might make raising children a type of public good game….
  • Democracy where agents vote on group descions. This could be a simple majority vote.
    • voting takes place in a neighborhoods
    • voting within FOV
    • voting within tribe neighborhood
    • voting within culture neighborhood
    • voting within a db-scan cluster.
    • agents vote are a function of friendship ties, kinshop ties and social prestige.
  • meritocracy where greatest ability and initiative are rewarded with leadership.
  • theocracy where the agent with the most religious prestige is the leader.

leadership should have some advantages over non-leaders. This could be in the form of - access to more information - ability to make group trades - ability to allocate roles - in a group task like draining a flooded area. - send agents to explore - send agents to trade - send agents to war - send agents to build a public good - send agents to harvest different resources

  • in this case all agents that comply with the leader’s request should get social prestige.
  • ability to make group decisions like
    • going to war or making peace with another group.
    • trade with another group.

It would be more interesting if leadership arose by one or a combinations of social prestidge, martial prowess and wealth.

also by economic power. This would allow for a more complex social structure where leaders can be challenged by other agents. This would be a public goods game where agents would need to cooperate to challenge the leader. This would be a

Reproduction rule variants

Asexual reproduction comes to sugarscape.

  • agent with an asex_counter>0 can reproduce asxually - decremnting the counter, splitting resources and reducing thier age.
    • splitting has a counter and the abilty is lost
  • this can only be done a finite number of time. Thus if an agent with max age 100 can split 5 times. He can live till 5*50+50=300
  • at which point he will have 32 syblings all age 300 so he has lived 3600 year. In between he may have also reproduced sexually.
  • this makes all his progeny subject to inderitance rule.

Haploid reproduction comes to sugarscape

  • kinship rules based on social insects are introduced here.
    • queens, sister-workers, warrior-males, etc
  • higher kinship should confer altruism in trade
  • higher kinship should confer altruistic sharing of information.

Child rearing comes to Sugarscape

  • we start by adding progeny welfare into an agents utility.
  • we could have an exponent payoffs for progeny that falls to a min on reaching puberty.
  • we could add this to grandparents.
  • we could penalize ancestors whose children are out of vision.

ancestors can expand resources/time to nature their progeny/descendants.

  1. teaching a language by talking. children learn a language by talking with nearby agents.
    • parent can speed this up by talking to their children.
    • This will also impart their culture for free.
  2. after teaching the language the parent can teach their children to trade, fight. etc. this will change the probability of a successful transitions to a reward in these actions.
  3. bootstrap their priors by talking to them.
  4. bootstrap their neural networks by transferring weights.

Language comes to Sugarscape

  • agent can communicate
    • rss in their view
    • agent in their view
    • prices in their view
    • strategies
  • value of info = payoffs of action with info - payoffs of action without info
    • trading in information.
  • memory should have costs

to communicate agents must be in FOV. they first need to learn a signaling system. they can also learn communication protocol they can also learn a grammar to allow a complex signaling system. This can be used to trade information. Agents can trade information about the world. This can be used to coordinate actions, trade, etc.

Information comes to Sugarscape

with a signaling systems agents can exchange information about the world. This can be used to coordinate actions, trade, etc.

information can be traded for resources, social prestige, etc.

Economic Ideas

Specilization comes to Sugarscape - Specilization I

  • each agent has either a probability of success or an output level associated with thier abilities.
  • agents who specilize can generate greater wealth and then trade for the missing commodeties.
  • agents can also learn to trade thier specilized goods for other goods.

Art comes to Sugarscape - Specilization II

Art is like artifacts allowing occupational specialization without conferring special intrinsics it can be traded in a Kulla ring system where the art is passed around the group gaining value as it is passed around. It might confer social prestige to its current owner.

There are as many type of art as that of artists. Artist can charge more for new art. The value is based on the value of their previous art and the number of art they have created.

Artifacts comes to sugarscape - Specilization III

  • agents can have a probability of success in harvesting different resources
  • agents that can create less pollution when harvesting
  • agents that are better at combat
  • agents that are better at trade
  • agents can create artworks that can be traded.
    • art confers social prestige
    • art is traded in a kulla ring system where the art is passed around the group gaining value as it is passed around.
  • agents can create artifacts that can be traded
    • artifacts can be traded for resources and social prestige
    • artifacts may confer an intinsic bonus
    • artifacts may be needed to achieve goals
      • glasses can boost FOV.
      • a telescope gives a bigger boost.
      • a fan might slow down metabolism.
      • fertility artifacts may be needed to reproduce
      • a shovel might speed up draining a flooded area
      • weapons and armor could boost chance to win battles
      • love potion may be needed to reproduce with a specific mate.
      • an artifact may be needed to build a boat
      • a key may be needed to access a new area
      • artifacts may be needed to build other artifacts
      • shoes might speed up movement/reduce metabolic costs
      • a map might be used to keep track of high resources regions
      • lexicon might be used to quickly learn the new signaling systems of other cultures once the first signaling system is learned.
    • artifacts may be needed to build a dam, channel, bridge, boar, etc.
  • agents can have a probability of success in combat

Trade rule variants

Citation

BibTeX citation:
@online{bochman2024,
  author = {Bochman, Oren},
  title = {More {Sugar} Please},
  date = {2024-06-11},
  url = {https://orenbochman.github.io/posts/2024/2024-06-11-more-sugar-please/},
  langid = {en}
}
For attribution, please cite this work as:
Bochman, Oren. 2024. “More Sugar Please.” June 11, 2024. https://orenbochman.github.io/posts/2024/2024-06-11-more-sugar-please/.