
#1 2024-04-24 20:54:48

MrVideo
Member
From: The Internet
Registered: 2023-12-05
Posts: 8

ROS nodes killed by OOMK in Docker container

Hi, I'm having a problem with a Docker container that I know for a fact works on other OSs but not on Arch.

I am trying to run some ROS nodes in a ros-noetic container that I built (we have to use ROS1, not ROS2, because our university requires it).

I have tried running this container on my Mac (macOS Ventura, M1 Pro chip, aarch64) and it works as expected, while on my Arch PC (which is x64) the ROS nodes are killed by the OOMK almost immediately.

When I run the command:

roslaunch package launch.launch

I get the following output:

... logging to /root/.ros/log/52cc7d8c-027a-11ef-aefe-0242ac170003/roslaunch-5f3803c1ba59-620.log
Checking log directory for disk usage. This may take a while.
Press Ctrl-C to interrupt
Done checking log file disk usage. Usage is <1GB.

started roslaunch server http://5f3803c1ba59:37523/

SUMMARY
========

PARAMETERS
 * /rosdistro: noetic
 * /rosversion: 1.16.0

NODES
  /gps_to_odom/
    deg2rad (first_project/deg2rad)
    ecef2enu (first_project/ecef2enu)
    gps_to_odom (first_project/gps_to_odom)
    heading (first_project/heading)
    heading2quat (first_project/heading2quat)
    lla2ecef (first_project/lla2ecef)
    n_func (first_project/n_func)

ROS_MASTER_URI=http://localhost:11311

process[gps_to_odom/gps_to_odom-1]: started with pid [634]
process[gps_to_odom/deg2rad-2]: started with pid [635]
process[gps_to_odom/n_func-3]: started with pid [636]
process[gps_to_odom/lla2ecef-4]: started with pid [637]
process[gps_to_odom/ecef2enu-5]: started with pid [638]
process[gps_to_odom/heading-6]: started with pid [639]
process[gps_to_odom/heading2quat-7]: started with pid [640]
[gps_to_odom/ecef2enu-5] process has died [pid 638, exit code -9, cmd /root/robotics/devel/lib/first_project/ecef2enu __name:=ecef2enu __log:=/root/.ros/log/52cc7d8c-027a-11ef-aefe-0242ac170003/gps_to_odom-ecef2enu-5.log].
log file: /root/.ros/log/52cc7d8c-027a-11ef-aefe-0242ac170003/gps_to_odom-ecef2enu-5*.log
[gps_to_odom/lla2ecef-4] process has died [pid 637, exit code -9, cmd /root/robotics/devel/lib/first_project/lla2ecef __name:=lla2ecef __log:=/root/.ros/log/52cc7d8c-027a-11ef-aefe-0242ac170003/gps_to_odom-lla2ecef-4.log].
log file: /root/.ros/log/52cc7d8c-027a-11ef-aefe-0242ac170003/gps_to_odom-lla2ecef-4*.log
[gps_to_odom/heading2quat-7] process has died [pid 640, exit code -9, cmd /root/robotics/devel/lib/first_project/heading2quat __name:=heading2quat __log:=/root/.ros/log/52cc7d8c-027a-11ef-aefe-0242ac170003/gps_to_odom-heading2quat-7.log].
log file: /root/.ros/log/52cc7d8c-027a-11ef-aefe-0242ac170003/gps_to_odom-heading2quat-7*.log
[gps_to_odom/n_func-3] process has died [pid 636, exit code -9, cmd /root/robotics/devel/lib/first_project/n_func __name:=n_func __log:=/root/.ros/log/52cc7d8c-027a-11ef-aefe-0242ac170003/gps_to_odom-n_func-3.log].
log file: /root/.ros/log/52cc7d8c-027a-11ef-aefe-0242ac170003/gps_to_odom-n_func-3*.log
[gps_to_odom/gps_to_odom-1] process has died [pid 634, exit code -9, cmd /root/robotics/devel/lib/first_project/gps_to_odom __name:=gps_to_odom __log:=/root/.ros/log/52cc7d8c-027a-11ef-aefe-0242ac170003/gps_to_odom-gps_to_odom-1.log].
log file: /root/.ros/log/52cc7d8c-027a-11ef-aefe-0242ac170003/gps_to_odom-gps_to_odom-1*.log
[gps_to_odom/deg2rad-2] process has died [pid 635, exit code -9, cmd /root/robotics/devel/lib/first_project/deg2rad __name:=deg2rad __log:=/root/.ros/log/52cc7d8c-027a-11ef-aefe-0242ac170003/gps_to_odom-deg2rad-2.log].
log file: /root/.ros/log/52cc7d8c-027a-11ef-aefe-0242ac170003/gps_to_odom-deg2rad-2*.log
[gps_to_odom/heading-6] process has died [pid 639, exit code -9, cmd /root/robotics/devel/lib/first_project/heading __name:=heading __log:=/root/.ros/log/52cc7d8c-027a-11ef-aefe-0242ac170003/gps_to_odom-heading-6.log].
log file: /root/.ros/log/52cc7d8c-027a-11ef-aefe-0242ac170003/gps_to_odom-heading-6*.log
all processes on machine have died, roslaunch will exit
shutting down processing monitor...
... shutting down processing monitor complete
done

The Dockerfile I'm using is the following:

# Start from the ROS Noetic image
FROM ros:noetic-perception

### ROS INSTALLATION ###

# Install required packages
RUN apt-get update && \
    apt-get install -y ros-noetic-rqt* net-tools iproute2 ros-noetic-plotjuggler* ros-noetic-foxglove-bridge ros-noetic-turtlesim

# Create catkin workspace and add it to .bashrc
RUN mkdir -p /root/robotics/src && \
    echo "source /root/robotics/devel/setup.bash" >> /root/.bashrc
    
# Switch to the directory and initialize the workspace 
RUN /bin/bash -c '. /opt/ros/noetic/setup.bash; cd /root/robotics; catkin_make'

### NEOVIM INSTALLATION ###

# Install build dependencies for Neovim
RUN apt update && apt upgrade -y
RUN apt install -y ninja-build gettext cmake unzip curl build-essential git

# Install ripgrep for live grep in Telescope.nvim
RUN apt install -y ripgrep

# Install clangd-12 for LSP functionality in Neovim and set it as default
RUN apt install -y clangd-12
RUN update-alternatives --install /usr/bin/clangd clangd /usr/bin/clangd-12 100

# Move to home directory
WORKDIR /root/

# Clone Neovim from source
RUN git clone https://github.com/neovim/neovim

# Move into the Neovim directory
WORKDIR /root/neovim

# Select stable version
RUN git checkout stable

# Build and install
RUN make CMAKE_BUILD_TYPE=RelWithDebInfo
RUN make install

### NEOVIM CONFIGURATION ###

# Move to home directory
WORKDIR /root/

# Create .config directory
RUN mkdir .config

# Move to config directory
WORKDIR /root/.config/

# Clone configs for Neovim
RUN git clone https://github.com/MrVideo/ROSNeovimConfig.git nvim

# Setup clangd with a config file in the working directory
WORKDIR /root/robotics
RUN echo "CompileFlags:" >> .clangd
RUN echo "  Add: [-I/opt/ros/noetic/include, -L/opt/ros/noetic/lib, -I/root/robotics/devel/include/]" >> .clangd

# Start ROS core services
CMD ["roscore"]

And the docker-compose.yml is this:

version: '3.8'
services:
  ros:
    container_name: ros
    build:
      context: .
    environment:
      - DISPLAY=novnc:0.0
    networks:
      - rosnet
    volumes:
      - rosvol:/root/robotics/
      - nvim_cache:/root/.local/share/nvim/
    deploy:
      resources:
        limits:
          cpus: '3.00'
          memory: 4G
  novnc:
    container_name: novnc
    image: theasp/novnc
    environment:
      - DISPLAY_WIDTH=1512
      - DISPLAY_HEIGHT=982
      - RUN_XTERM=no
    ports:
      - "8080:8080"
    networks:
      - rosnet
    ulimits:
      nofile:
        soft: 65536
        hard: 65536

networks:
  rosnet:
    driver: bridge

volumes:
  rosvol:
  nvim_cache:

Keep in mind that the deploy.resources.limits property for the ros container was not there originally: I added it only to stop my PC from crashing while I ran ROS nodes and tested possible solutions. However much memory I allow there, it gets used up: I have 32 GB of RAM and ROS filled it completely, including my 8 GB swap partition.
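To confirm that the memory limit from the compose file is actually applied, a small check like the one below could be run inside the ros container. It is only a sketch: it assumes the usual cgroup mount points (`/sys/fs/cgroup/memory.max` on cgroup v2, `/sys/fs/cgroup/memory/memory.limit_in_bytes` on cgroup v1), and the function names are made up for illustration.

```python
from pathlib import Path
from typing import Optional

# Standard locations of the memory limit as seen from inside a container.
CGROUP_V2 = Path("/sys/fs/cgroup/memory.max")
CGROUP_V1 = Path("/sys/fs/cgroup/memory/memory.limit_in_bytes")


def parse_limit(raw: str) -> Optional[int]:
    """Return the limit in bytes, or None when cgroup v2 reports 'max'."""
    raw = raw.strip()
    return None if raw == "max" else int(raw)


def container_memory_limit() -> Optional[int]:
    """Read the first limit file that exists; None if no limit is found."""
    for path in (CGROUP_V2, CGROUP_V1):
        if path.exists():
            return parse_limit(path.read_text())
    return None


if __name__ == "__main__":
    limit = container_memory_limit()
    # Note: cgroup v1 reports "no limit" as a very large number, not 'max'.
    print("no limit found" if limit is None else f"{limit / 2**30:.1f} GiB")
```

With the 4G limit in place this should report roughly 4.0 GiB; any other result would suggest the deploy section is not being honoured by the compose version in use.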

When I run the roslaunch command on my Mac nothing special happens and all nodes behave as expected. On Arch I have also tried running other standard nodes, like turtlesim turtlesim_node, and they are all killed with exit code -9 (SIGKILL), which points to the OOM killer.
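For reference, the -9 in the roslaunch output above can be decoded as follows. This is a minimal sketch that assumes roslaunch reports return codes the way Python's subprocess module does, where a negative value -N means the process was terminated by signal N; the helper name is hypothetical.

```python
import signal


def describe_exit_code(code: int) -> str:
    """Explain a return code using the Python subprocess convention."""
    if code < 0:
        # Negative values mean "terminated by signal -code" (POSIX only).
        return f"killed by {signal.Signals(-code).name}"
    return f"exited normally with status {code}"


if __name__ == "__main__":
    print(describe_exit_code(-9))  # killed by SIGKILL
```

SIGKILL on its own does not prove the OOM killer was responsible; the kernel log on the host (for example via dmesg) is where an actual OOM kill would be recorded.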

I have also tried using an Ubuntu 22.04.4 VM and it seems to be working fine there as well.

Note that I cannot share my code because of the anti-plagiarism mechanisms my university might use when evaluating my project; sorry about the inconvenience. If our project got flunked because of this, my project mates would be pretty mad at me.

I would be really glad if someone could help me with this; I've been banging my head against this problem all day long. Thank you :)


I just try not to duck it up


#2 2024-05-03 13:41:47

Awebb
Member
Registered: 2010-05-06
Posts: 6,311

Re: ROS nodes killed by OOMK in Docker container

If Docker works on Ubuntu but not on Arch, then chances are Ubuntu has some Docker or kernel configuration that Arch doesn't, or vice versa.


#3 2024-05-03 14:14:37

MrVideo
Member
From: The Internet
Registered: 2023-12-05
Posts: 8

Re: ROS nodes killed by OOMK in Docker container

Yeah, that's what I was thinking as well, but I have no idea how to check which settings differ and which of them actually matter for this problem.
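A first step that does not require knowing which settings matter would be to dump the resource limits the container actually sees on each host (Arch, the Ubuntu VM, the Mac) and diff the results. The sketch below uses Python's standard resource module; the list of limits is just a guess at where to look, not a diagnosis. The open-file limit is included because the compose file above sets nofile for the novnc service but not for ros.

```python
import json
import resource

# Limit names from Python's resource module; any that are missing on a
# given platform are skipped.
LIMIT_NAMES = [
    "RLIMIT_NOFILE",   # max open file descriptors
    "RLIMIT_NPROC",    # max processes
    "RLIMIT_AS",       # max address space
    "RLIMIT_STACK",    # max stack size
    "RLIMIT_MEMLOCK",  # max locked memory
]


def fmt(value):
    """Render RLIM_INFINITY as a readable string."""
    return "unlimited" if value == resource.RLIM_INFINITY else value


def snapshot():
    """Collect soft and hard limits for the current process."""
    out = {}
    for name in LIMIT_NAMES:
        if not hasattr(resource, name):
            continue
        soft, hard = resource.getrlimit(getattr(resource, name))
        out[name] = {"soft": fmt(soft), "hard": fmt(hard)}
    return out


if __name__ == "__main__":
    # Sorted JSON makes the output easy to diff between hosts.
    print(json.dumps(snapshot(), indent=2, sort_keys=True))
```

Running this inside the ros container on each machine and comparing the output would at least show whether the process limits differ between the hosts, which narrows down where to look next.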

