Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

High probability crash of 800 agent simulation #64

Open
kabirkbr opened this issue Oct 27, 2018 · 0 comments
Open

High probability crash of 800 agent simulation #64

kabirkbr opened this issue Oct 27, 2018 · 0 comments
Assignees
Labels
bug Something isn't working
Milestone

Comments

@kabirkbr
Copy link
Collaborator

When running src/test/groovy/Experiments.experimentOne with 800 agents, almost always the run experiences one or two crashes, similar to this one:

Crashed simulation:

cat OfferNet.log | grep experimentId | tail -n 1:

2018-10-27 08:53:52,023 [Test worker] INFO  Experiments.class  : 
experimentId=EXP10-26-01-05-AJag2Q : simulationId=SIM10-27-08-53-Kt5aMA--DV : 
keyword=simulationParameters : agentNumber=800 : chainLength=5 : 
randomWorksNumberMultiplier=2 : maxDistance=30 : similaritySearchThreshold=1 : 
similarityConnectThreshold=0.93 : graphType=smallWorld

Number of simulations:

cat OfferNet.log | grep experimentId | grep -v timeout | wc -l:

24

Console output:

io.singularitynet.offernet.Experiments > experimentOne FAILED
    java.util.concurrent.TimeoutException: Futures timed out after [30 seconds]
        at scala.concurrent.impl.Promise$DefaultPromise.ready(Promise.scala:255)
        at scala.concurrent.impl.Promise$DefaultPromise.result(Promise.scala:259)
        at scala.concurrent.Await$.$anonfun$result$1(package.scala:215)
        at scala.concurrent.BlockContext$DefaultBlockContext$.blockOn(BlockContext.scala:53)
        at scala.concurrent.Await$.result(package.scala:142)
        at scala.concurrent.Await.result(package.scala)
        at scala.concurrent.Await$result.call(Unknown Source)
        at io.singularitynet.offernet.Simulation.getAgentVertexId(Simulation.groovy:151)
        at sun.reflect.GeneratedMethodAccessor9.invoke(Unknown Source)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at org.codehaus.groovy.runtime.callsite.PogoMetaMethodSite$PogoCachedMethodSiteNoUnwrapNoCoerce.invoke(PogoMetaMethodSite.java:210)
        at org.codehaus.groovy.runtime.callsite.PogoMetaMethodSite.callCurrent(PogoMetaMethodSite.java:59)
        at org.codehaus.groovy.runtime.callsite.AbstractCallSite.callCurrent(AbstractCallSite.java:166)
        at io.singularitynet.offernet.Simulation.createAgentWithId(Simulation.groovy:125)
        at sun.reflect.GeneratedMethodAccessor71.invoke(Unknown Source)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at org.codehaus.groovy.reflection.CachedMethod.invoke(CachedMethod.java:93)
        at groovy.lang.MetaMethod.doMethodInvoke(MetaMethod.java:325)
        at groovy.lang.MetaClassImpl.invokeMethod(MetaClassImpl.java:1213)
        at groovy.lang.MetaClassImpl.invokeMethod(MetaClassImpl.java:1022)
        at org.codehaus.groovy.runtime.callsite.PogoMetaClassSite.callCurrent(PogoMetaClassSite.java:69)
        at org.codehaus.groovy.runtime.callsite.AbstractCallSite.callCurrent(AbstractCallSite.java:166)
        at io.singularitynet.offernet.Simulation$_recreateAgents_closure1.doCall(Simulation.groovy:138)
        at sun.reflect.GeneratedMethodAccessor70.invoke(Unknown Source)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at org.codehaus.groovy.reflection.CachedMethod.invoke(CachedMethod.java:93)
        at groovy.lang.MetaMethod.doMethodInvoke(MetaMethod.java:325)
        at org.codehaus.groovy.runtime.metaclass.ClosureMetaClass.invokeMethod(ClosureMetaClass.java:294)
        at groovy.lang.MetaClassImpl.invokeMethod(MetaClassImpl.java:1022)
        at groovy.lang.Closure.call(Closure.java:414)
        at groovy.lang.Closure.call(Closure.java:430)
        at org.codehaus.groovy.runtime.DefaultGroovyMethods.each(DefaultGroovyMethods.java:2040)
        at org.codehaus.groovy.runtime.DefaultGroovyMethods.each(DefaultGroovyMethods.java:2025)
        at org.codehaus.groovy.runtime.DefaultGroovyMethods.each(DefaultGroovyMethods.java:2066)
        at org.codehaus.groovy.runtime.dgm$162.invoke(Unknown Source)
        at org.codehaus.groovy.runtime.callsite.PojoMetaMethodSite$PojoMetaMethodSiteNoUnwrapNoCoerce.invoke(PojoMetaMethodSite.java:274)
        at org.codehaus.groovy.runtime.callsite.PojoMetaMethodSite.call(PojoMetaMethodSite.java:56)
        at org.codehaus.groovy.runtime.callsite.AbstractCallSite.call(AbstractCallSite.java:125)
        at io.singularitynet.offernet.Simulation.recreateAgents(Simulation.groovy:137)
        at io.singularitynet.offernet.Simulation$recreateAgents$5.call(Unknown Source)
        at io.singularitynet.offernet.Experiments.recreateActorSystem(Experiments.groovy:200)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at org.codehaus.groovy.reflection.CachedMethod.invoke(CachedMethod.java:93)
        at groovy.lang.MetaMethod.doMethodInvoke(MetaMethod.java:325)
at groovy.lang.MetaClassImpl.invokeMethod(MetaClassImpl.java:1213)
        at groovy.lang.MetaClassImpl.invokeMethod(MetaClassImpl.java:1022)
        at org.codehaus.groovy.runtime.callsite.PogoMetaClassSite.callCurrent(PogoMetaClassSite.java:69)
        at org.codehaus.groovy.runtime.callsite.AbstractCallSite.callCurrent(AbstractCallSite.java:166)
        at io.singularitynet.offernet.Simulation$_recreateAgents_closure1.doCall(Simulation.groovy:138)
        at sun.reflect.GeneratedMethodAccessor70.invoke(Unknown Source)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at org.codehaus.groovy.reflection.CachedMethod.invoke(CachedMethod.java:93)
        at groovy.lang.MetaMethod.doMethodInvoke(MetaMethod.java:325)
        at org.codehaus.groovy.runtime.metaclass.ClosureMetaClass.invokeMethod(ClosureMetaClass.java:294)
        at groovy.lang.MetaClassImpl.invokeMethod(MetaClassImpl.java:1022)
        at groovy.lang.Closure.call(Closure.java:414)
        at groovy.lang.Closure.call(Closure.java:430)
        at org.codehaus.groovy.runtime.DefaultGroovyMethods.each(DefaultGroovyMethods.java:2040)
        at org.codehaus.groovy.runtime.DefaultGroovyMethods.each(DefaultGroovyMethods.java:2025)
        at org.codehaus.groovy.runtime.DefaultGroovyMethods.each(DefaultGroovyMethods.java:2066)
        at org.codehaus.groovy.runtime.dgm$162.invoke(Unknown Source)
        at org.codehaus.groovy.runtime.callsite.PojoMetaMethodSite$PojoMetaMethodSiteNoUnwrapNoCoerce.invoke(PojoMetaMethodSite.java:274)
        at org.codehaus.groovy.runtime.callsite.PojoMetaMethodSite.call(PojoMetaMethodSite.java:56)
        at org.codehaus.groovy.runtime.callsite.AbstractCallSite.call(AbstractCallSite.java:125)
        at io.singularitynet.offernet.Simulation.recreateAgents(Simulation.groovy:137)
        at io.singularitynet.offernet.Simulation$recreateAgents$5.call(Unknown Source)
        at io.singularitynet.offernet.Experiments.recreateActorSystem(Experiments.groovy:200)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at org.codehaus.groovy.reflection.CachedMethod.invoke(CachedMethod.java:93)
        at groovy.lang.MetaMethod.doMethodInvoke(MetaMethod.java:325)
        at org.codehaus.groovy.runtime.metaclass.ClosureMetaClass.invokeMethod(ClosureMetaClass.java:384)
        at groovy.lang.MetaClassImpl.invokeMethod(MetaClassImpl.java:1022)
        at org.codehaus.groovy.runtime.callsite.PogoMetaClassSite.callCurrent(PogoMetaClassSite.java:69)
        at org.codehaus.groovy.runtime.callsite.AbstractCallSite.callCurrent(AbstractCallSite.java:174)
        at io.singularitynet.offernet.Experiments$_experimentOne_closure1$_closure2$_closure3$_closure4$_closure5$_closure6$_closure7.doCall(Experiments.groovy:160)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at org.codehaus.groovy.reflection.CachedMethod.invoke(CachedMethod.java:93)
        at groovy.lang.MetaMethod.doMethodInvoke(MetaMethod.java:325)
        at org.codehaus.groovy.runtime.metaclass.ClosureMetaClass.invokeMethod(ClosureMetaClass.java:294)
        at groovy.lang.MetaClassImpl.invokeMethod(MetaClassImpl.java:1022)
        at groovy.lang.Closure.call(Closure.java:414)
        at groovy.lang.Closure.call(Closure.java:430)
        at org.codehaus.groovy.runtime.DefaultGroovyMethods.each(DefaultGroovyMethods.java:2040)
        at org.codehaus.groovy.runtime.DefaultGroovyMethods.each(DefaultGroovyMethods.java:2025)
        at org.codehaus.groovy.runtime.DefaultGroovyMethods.each(DefaultGroovyMethods.java:2066)
        at org.codehaus.groovy.runtime.dgm$162.invoke(Unknown Source)
        at org.codehaus.groovy.runtime.callsite.PojoMetaMethodSite$PojoMetaMethodSiteNoUnwrapNoCoerce.invoke(PojoMetaMethodSite.java:274)
        at org.codehaus.groovy.runtime.callsite.PojoMetaMethodSite.call(PojoMetaMethodSite.java:56)
        at org.codehaus.groovy.runtime.callsite.AbstractCallSite.call(AbstractCallSite.java:125)
        at io.singularitynet.offernet.Experiments$_experimentOne_closure1$_closure2$_closure3$_closure4$_closure5$_closure6.doCall(Experiments.groovy:101)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
[...]

Here is the end of OfferNet.log:

cat OfferNet.log | tail -n 200:

2018-10-27 09:01:55,078 [OfferNet-akka.actor.default-dispatcher-3] INFO  Agent.class  : method=searchAndConnect : simulationId=SIM10-27-08-53-Kt5aMA--CV : agentId=964460d7-f488-4bc4-aec6-4e7d7e1d70a1 ; similarityThreshold=0.93 ; maxReachDistance=30 ; connectionsCreated=228 : wallTime_ms=41022 msec.
2018-10-27 09:01:55,085 [OfferNet-akka.actor.default-dispatcher-3] INFO  akka.actor.RepointableActorRef  : Message [io.singularitynet.offernet.Method] from TestActor[akka://SimulationTests/user/$$w] to Actor[akka://OfferNet/user/ceec979a-517d-4b57-beb4-158c60e215b6#588020435] was not delivered. [1] dead letters encountered. This logging can be turned off or adjusted with configuration settings 'akka.log-dead-letters' and 'akka.log-dead-letters-during-shutdown'.
2018-10-27 09:01:55,085 [OfferNet-akka.actor.default-dispatcher-3] INFO  akka.actor.RepointableActorRef  : Message [io.singularitynet.offernet.Method] from TestActor[akka://SimulationTests/user/$$w] to Actor[akka://OfferNet/user/ed148cf6-109e-4e9d-a1d8-15cc087ea49c#242017620] was not delivered. [2] dead letters encountered. This logging can be turned off or adjusted with configuration settings 'akka.log-dead-letters' and 'akka.log-dead-letters-during-shutdown'.
2018-10-27 09:01:55,085 [OfferNet-akka.actor.default-dispatcher-3] INFO  akka.actor.RepointableActorRef  : Message [io.singularitynet.offernet.Method] from TestActor[akka://SimulationTests/user/$$w] to Actor[akka://OfferNet/user/04c5c33f-35d7-45cd-a9c3-f08e1b51e274#-2135737520] was not delivered. [3] dead letters encountered. This logging can be turned off or adjusted with configuration settings 'akka.log-dead-letters' and 'akka.log-dead-letters-during-shutdown'.
2018-10-27 09:01:55,085 [OfferNet-akka.actor.default-dispatcher-3] INFO  akka.actor.RepointableActorRef  : Message [io.singularitynet.offernet.Method] from TestActor[akka://SimulationTests/user/$$w] to Actor[akka://OfferNet/user/c4628a7d-9439-4447-842a-007e7f96ec32#-1072501352] was not delivered. [4] dead letters encountered. This logging can be turned off or adjusted with configuration settings 'akka.log-dead-letters' and 'akka.log-dead-letters-during-shutdown'.
2018-10-27 09:01:55,085 [OfferNet-akka.actor.default-dispatcher-3] INFO  akka.actor.RepointableActorRef  : Message [io.singularitynet.offernet.Method] from TestActor[akka://SimulationTests/user/$$w] to Actor[akka://OfferNet/user/c532470b-30ee-4bdd-98cf-863cd9f293ae#848325729] was not delivered. [5] dead letters encountered. This logging can be turned off or adjusted with configuration settings 'akka.log-dead-letters' and 'akka.log-dead-letters-during-shutdown'.
2018-10-27 09:01:55,086 [OfferNet-akka.actor.default-dispatcher-3] INFO  akka.actor.RepointableActorRef  : Message [io.singularitynet.offernet.Method] from TestActor[akka://SimulationTests/user/$$w] to Actor[akka://OfferNet/user/f6d1fed0-19b8-4543-b839-f5b368bb366b#-1904560384] was not delivered. [6] dead letters encountered. This logging can be turned off or adjusted with configuration settings 'akka.log-dead-letters' and 'akka.log-dead-letters-during-shutdown'.
2018-10-27 09:01:55,086 [OfferNet-akka.actor.default-dispatcher-3] INFO  akka.actor.RepointableActorRef  : Message [io.singularitynet.offernet.Method] from TestActor[akka://SimulationTests/user/$$w] to Actor[akka://OfferNet/user/fefce533-0fd2-42ac-818f-a3d653a1c3c2#-462127024] was not delivered. [7] dead letters encountered. This logging can be turned off or adjusted with configuration settings 'akka.log-dead-letters' and 'akka.log-dead-letters-during-shutdown'.
2018-10-27 09:01:55,087 [OfferNet-akka.actor.default-dispatcher-3] INFO  akka.actor.RepointableActorRef  : Message [io.singularitynet.offernet.Method] from TestActor[akka://SimulationTests/user/$$w] to Actor[akka://OfferNet/user/ca3adcfc-dbd1-474d-af69-e6ae761f8a53#1456391419] was not delivered. [8] dead letters encountered. This logging can be turned off or adjusted with configuration settings 'akka.log-dead-letters' and 'akka.log-dead-letters-during-shutdown'.
2018-10-27 09:01:55,087 [OfferNet-akka.actor.default-dispatcher-3] INFO  akka.actor.RepointableActorRef  : Message [io.singularitynet.offernet.Method] from TestActor[akka://SimulationTests/user/$$w] to Actor[akka://OfferNet/user/8acab3f1-ce67-4fde-a37a-22aa1a194ef6#-245159385] was not delivered. [9] dead letters encountered. This logging can be turned off or adjusted with configuration settings 'akka.log-dead-letters' and 'akka.log-dead-letters-during-shutdown'.
2018-10-27 09:01:55,087 [OfferNet-akka.actor.default-dispatcher-3] INFO  akka.actor.RepointableActorRef  : Message [io.singularitynet.offernet.Method] from TestActor[akka://SimulationTests/user/$$w] to Actor[akka://OfferNet/user/b5772ac6-f250-47c5-a5ef-476a898e3694#1519662574] was not delivered. [10] dead letters encountered, no more dead letters will be logged. This logging can be turned off or adjusted with configuration settings 'akka.log-dead-letters' and 'akka.log-dead-letters-during-shutdown'.
2018-10-27 09:01:55,776 [OfferNet-akka.actor.default-dispatcher-35] INFO  Agent.class  : method=searchAndConnect : simulationId=SIM10-27-08-53-Kt5aMA--CV : agentId=5d3c9f72-e5ce-4f29-a28c-bd4f8af6baba ; similarityThreshold=0.93 ; maxReachDistance=30 ; connectionsCreated=1031 : wallTime_ms=45321 msec.

It seems that the problem is related to recreating agents, since the simulation crashes after creating only two:

cat OfferNet.log | tail -n 10000 | grep init:

2018-10-27 08:53:52,011 [OfferNet-akka.actor.default-dispatcher-27] INFO  Agent.class  : method=<init> : simulationId=SIM10-27-08-53-Kt5aMA--DV : agentId=b51bfe76-5e5f-443e-912a-af01f11b6a62 : wallTime_ms=53 msec.
2018-10-27 09:00:28,705 [Test worker] INFO  OfferNet.class  : method=<init> : simulationId=SIM10-27-08-53-Kt5aMA--DV : wallTime_ms=139776 msec.
2018-10-27 09:00:28,705 [Test worker] INFO  Simulation.class  : method=<init> : simulationId=SIM10-27-08-53-Kt5aMA--CV : wallTime_ms=139776 msec.
2018-10-27 09:01:16,066 [OfferNet-akka.actor.default-dispatcher-6] INFO  Agent.class  : method=<init> : simulationId=SIM10-27-08-53-Kt5aMA--CV : agentId=90ccee44-e86f-40b8-9e5d-0d9f2171c893 : wallTime_ms=20056 msec.
2018-10-27 09:01:50,192 [OfferNet-akka.actor.default-dispatcher-6] INFO  Agent.class  : method=<init> : simulationId=SIM10-27-08-53-Kt5aMA--CV : agentId=843bcdbb-54e7-4f4d-bda6-cff693bb577a : wallTime_ms=34125 msec.

Bottom line, it seems that all this is related to graph database, since restarting DSE cluster makes the crashed simulation pass, but then it crashes the same way after 20-30 simulations. The only hypothesis that I come up now is that graph database server gets 'tired' / does not manage to garbage collect / clean cashes fast enough and therefore at certain point it does not manage to fit into timeout when recreating graphs. Getting rid of all Actor calls with timeouts may help -- after all it is a design flaw... But then, the whole problem may go away when turning only to decentralized simulations.

@kabirkbr kabirkbr added the bug Something isn't working label Oct 27, 2018
@kabirkbr kabirkbr added this to the v0.1.0 milestone Oct 27, 2018
@kabirkbr kabirkbr self-assigned this Oct 27, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

No branches or pull requests

1 participant