Debugging and Troubleshooting

Common errors

  1. If TTP displays `KeyError: "Data"`, the error lies in a worker rather than in TTP itself. Check the worker for errors. These errors are generally the result of mismatched data submitted during the "connect" phase. To recover, clean the directories where local data is stored by removing all the intermediate result folders.

Before cleaning the directories

[Image: before clearing the directories]

After cleaning the directories

[Image: after clearing the directories]
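The clean-up described in step 1 can be scripted. The following is a minimal Python sketch and not part of Synergos: the function name `clear_intermediate_results` is hypothetical, and it assumes that all intermediate results sit as sub-folders of a single root directory, so point it at whichever directory your deployment actually uses. It defaults to a dry run that only lists what would be removed.

```python
import shutil
import tempfile
from pathlib import Path


def clear_intermediate_results(root, dry_run=True):
    """Remove every sub-folder directly under `root`.

    Hypothetical helper: `root` is assumed to be the directory that holds
    the intermediate result folders. Returns the list of folders found.
    With dry_run=True nothing is deleted; the folders are only listed.
    """
    root = Path(root)
    # Only real directories are targeted; files and symlinks are left alone.
    targets = sorted(
        p for p in root.iterdir() if p.is_dir() and not p.is_symlink()
    )
    if not dry_run:
        for folder in targets:
            shutil.rmtree(folder)
    return targets
```

Calling the function once with the default `dry_run=True` and inspecting the returned list before calling it again with `dry_run=False` avoids deleting the wrong folders.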

  2. If you receive a websocket error instead, the federated grid itself has gone down. Causes range from submitted parameters that are incompatible with PyTorch or with each other (e.g. trying to use BCELoss on multi-class predictions) to the websocket freezing due to orphaned coroutines. Apply compatible settings. It usually also helps to terminate all containers and restart them after applying the changes.
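An incompatible loss setting such as the BCELoss example in step 2 can be caught before anything is submitted to the grid. The following is a minimal Python sketch and not a Synergos API: the function `check_criterion` and its parameters are hypothetical. The loss names are real `torch.nn` class names, and the check assumes single-label classification, where binary cross-entropy losses only fit two-class problems.

```python
# Real torch.nn class names; the check built on them is hypothetical.
BINARY_LOSSES = {"BCELoss", "BCEWithLogitsLoss"}


def check_criterion(criterion, n_classes):
    """Return a list of problems found; an empty list means none detected.

    Assumes single-label classification with `n_classes` possible labels.
    """
    problems = []
    if n_classes < 2:
        problems.append(f"n_classes must be at least 2, got {n_classes}")
    elif criterion in BINARY_LOSSES and n_classes > 2:
        problems.append(
            f"{criterion} is a binary loss but n_classes={n_classes}; "
            "consider CrossEntropyLoss for multi-class predictions"
        )
    return problems
```

A call such as `check_criterion("BCELoss", 3)` returns one problem message, while `check_criterion("CrossEntropyLoss", 3)` returns an empty list. The check covers only this one mismatch; other incompatible parameter combinations still need to be verified by hand.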

Terminating and pruning the container

When facing errors, it is advised to terminate and relaunch the containers after making the necessary changes. Terminating and pruning can be done using the following commands.

docker stop ttp worker_1 worker_2
docker container prune

If you encounter other errors while using Synergos, please raise them in our community group so that the community can also benefit from your findings.
