Sunday, 25 December 2016

Rule-Based Fault Management for Environmental Monitoring IoT system


Fault management is an important part of IoT management, and several approaches to fault detection and diagnosis in information systems exist. For the 3.0 edition of the Challenge I’m working on a Fault Management solution for a distributed Environmental Monitoring IoT system. The solution is based on rule-based principles of error (symptom) detection and fault isolation and diagnosis.

Components of the system

The solution will be implemented and deployed as a distributed system composed of several components. Here is a short description of the main components.
Component view of the solution

1. Sensor component, consisting of two identical battery-powered, WiFi-enabled IoT sensors based on the ESP8266 and BME280. The component provides temperature, humidity and barometric pressure values to its field gateway over MQTT, using a push/pull communication scheme.

2. IoT Gateway component, based on a Raspberry Pi running Raspbian OS with a USB WiFi dongle connected. Eclipse Kura will be installed on the device and used as the IoT gateway implementation. The gateway communicates with the IoT sensors via MQTT. An SNMP agent for Raspbian OS will also be deployed so that the RPi can receive SNMP commands and send SNMP trap events.

The following components will be deployed on two or more cloud CentOS 7 instances, provided by Vscale. Some components will be run inside Docker containers.

3. Connectivity component, based on Eclipse Hono. The component will provide a bidirectional communication channel between the IoT gateway and its cloud backend. Two protocols will be used: SNMP and MQTT. Telemetry data from sensors and control commands from the backend will be transmitted via MQTT. SNMP will be used for receiving TRAP and INFORM events from the OS components of the IoT gateway, and for sending GET requests from the backend. For SNMP I’m going to develop an SNMP Protocol Adapter for Hono. So on the gateway side there will be two separate data flows: sensor readings, and monitoring data (error events and alarms).

4. Fault Management component, based on JBoss Drools. The aim of this component is to receive symptom events (errors, alarms), detect errors, isolate and diagnose the causes of faults and apply recovery actions. For this purpose a set of rules will be created for decision making and complex event processing. The component will be able to send control messages to check the status of other components and to send commands to components as a recovery procedure (for example, trying to restart a failed component). The component will be deployed as a self-contained decision service that communicates with the Hono and Data Storage components via Apache Camel routes.

5. Data Storage component, implemented with the Redis data structure store. Two instances will be used: one as temporary storage for Environmental telemetry data, and a second as the Fault Management database, which persists symptom events, notifications and alerts.

6. Integration component, based on Apache Camel. This component will implement the business scenario of the solution by receiving Environmental telemetry data (temperature, humidity and barometric pressure values) from the Connectivity component and transmitting this data to the local geoinformation SaaS service, the Public Monitoring Project (narodmon.ru), to display sensor readings on the world map.

7. UI component, implemented as a standalone web application. The component will deliver real-time fault and system status data to the user, providing online visualization and notification functionality over the WebSocket protocol.

8. UI client component, implemented as an HTML5 web client application. The component will display the data provided by the UI component web app.

Sample Use Cases

Use Case 1: Two Environmental data sensors are deployed in the field. One sensor is in active mode, periodically sending readings to its field gateway. The other sensor is a reserve and stays in standby mode. The first sensor stops functioning, for example because of power problems. The Fault Management system detects that readings have ceased to flow from the gateway. The system initiates the fault isolation procedure by sending a status request control message to the malfunctioning sensor. As the sensor doesn’t respond, the system executes a recovery action by sending a wake command message to the second sensor. The second sensor switches to active mode, so the overall system continues to operate.
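
The detect, isolate and recover steps of this use case can be sketched in plain Java. This is only an illustration of the rule's logic, not the planned Drools implementation: the class, the ControlChannel interface, the sensor ids and the timeout are all hypothetical names and values introduced for the example.

```java
/** Sketch of the Use Case 1 failover logic; every name here is hypothetical. */
final class SensorFailoverRule {

    /** Stand-in for the control channel (MQTT messages via the gateway in the real system). */
    interface ControlChannel {
        boolean requestStatus(String sensorId); // true if the sensor answered
        void wake(String sensorId);
    }

    private final String standbySensor;
    private final long silenceTimeoutMillis;
    private String activeSensor;
    private long lastReadingMillis;

    SensorFailoverRule(String activeSensor, String standbySensor,
                       long silenceTimeoutMillis, long nowMillis) {
        this.activeSensor = activeSensor;
        this.standbySensor = standbySensor;
        this.silenceTimeoutMillis = silenceTimeoutMillis;
        this.lastReadingMillis = nowMillis;
    }

    /** Symptom source: remember when the active sensor last reported. */
    void onReading(String sensorId, long nowMillis) {
        if (sensorId.equals(activeSensor)) {
            lastReadingMillis = nowMillis;
        }
    }

    /** Detects silence, isolates the fault with a status request, then recovers. */
    String evaluate(long nowMillis, ControlChannel channel) {
        boolean silent = nowMillis - lastReadingMillis > silenceTimeoutMillis;
        boolean canFailOver = !activeSensor.equals(standbySensor);
        if (silent && canFailOver && !channel.requestStatus(activeSensor)) {
            channel.wake(standbySensor);
            activeSensor = standbySensor;
            lastReadingMillis = nowMillis; // give the standby sensor a full timeout to report
        }
        return activeSensor; // id of the sensor that is active after this evaluation
    }
}
```

Passing the current time in as a parameter keeps the sketch deterministic and easy to test; a rule engine would normally take the time from its own session clock.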

Use Case 2: The Fault Management system receives multiple SNMP trap events reporting high memory usage by the OS of the IoT field gateway. The system sends an SNMP GET request to retrieve the OS performance data. The response confirms the poor OS performance. The system executes the recovery procedure by sending a reset command message to the field gateway OS.

At this time I'm working on the hardware and software parts of the Sensor component.
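
A similar plain-Java sketch for this use case counts trap events inside a sliding time window and asks for confirmation before recovering. The threshold, window length, memory limit and the GatewayProbe interface (standing in for the SNMP GET request) are assumptions made for the example, not part of the actual design.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Sketch of the Use Case 2 logic; all names and thresholds are hypothetical. */
final class MemoryTrapRule {

    enum Action { NONE, RESET_GATEWAY }

    /** Stand-in for the SNMP GET request that retrieves OS performance data. */
    interface GatewayProbe {
        double memoryUsedPercent();
    }

    private final int trapThreshold;
    private final long windowMillis;
    private final double memoryLimitPercent;
    private final Deque<Long> trapTimes = new ArrayDeque<>();

    MemoryTrapRule(int trapThreshold, long windowMillis, double memoryLimitPercent) {
        this.trapThreshold = trapThreshold;
        this.windowMillis = windowMillis;
        this.memoryLimitPercent = memoryLimitPercent;
    }

    /** Called for every high-memory trap event received from the gateway. */
    Action onTrap(long nowMillis, GatewayProbe probe) {
        trapTimes.addLast(nowMillis);
        // Drop traps that have fallen out of the sliding window.
        while (nowMillis - trapTimes.peekFirst() > windowMillis) {
            trapTimes.removeFirst();
        }
        if (trapTimes.size() < trapThreshold) {
            return Action.NONE; // not enough symptoms yet
        }
        trapTimes.clear();
        // Confirm the symptom with a direct measurement before recovering.
        return probe.memoryUsedPercent() >= memoryLimitPercent
                ? Action.RESET_GATEWAY
                : Action.NONE;
    }
}
```

Confirming with a probe before resetting avoids rebooting the gateway on a burst of stale or duplicated traps.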

Friday, 26 February 2016

Hi! Only a few hours left before the end of the Challenge, so in this post I'm going to wrap up the current status of my project. Still I hope this post is not the last one :)

The idea of the project


The idea was to build a smart toy whose main purpose is to help babies learn to crawl. So what has been done so far?

Robot chassis

The robot chassis prototype has been built. As the basis of the chassis I'm using the Boe-Bot Parts Kit from Parallax:



I've made slight modifications that allow me to mount the Raspberry Pi A+ on the Aluminum Chassis:


Also I've replaced the standard Parallax servos from the kit with SM-S4303R continuous rotation servos, which have a slightly higher rotation speed. The robot's power source is a 4xAA NiMH rechargeable battery pack:

 
Since the output voltage of the pack is 4.8 V, which is not enough to power the Raspberry Pi, servos and other electronic parts, I'm using an MT3608 DC-DC Step Up Power Apply Module to raise the voltage to 5 V. My experiments have also shown that the Raspberry Pi is very sensitive to the current spikes produced by the servos and freezes every time they start to rotate. To fix this I decided to power the Raspberry Pi from a separate circuit. As a result, the power circuit contains two MT3608 modules connected in parallel to the power source:
 
 

Robot configuration highlights

The robot is connected to the wireless router via a Wi-Fi USB adapter and configured to use a static IP address inside the LAN.
To control the servos of the robot I'm using the ServoBlaster driver software. Pins 15 and 16 of the Raspberry Pi header are configured to control the servos. So the content of the /dev/servoblaster-cfg configuration file is:

    p1pins=15,16
    p5pins=

    Servo mapping:
         0 on P1-15          GPIO-22
         1 on P1-16          GPIO-23

Robot software

The project sources can be found at https://github.com/sergevas/bcbot. It is a Maven-based multi-module project. The bcbot-robot module contains the functionality that is deployed to the Raspberry Pi. The current version of the module includes implementations of several Apache Camel routes and Californium CoAP resources. The software runs on the Raspberry Pi as a standalone Java application. The RobotMain class boots the entire application. The class has several dependencies: Main is used to boot up the Camel Context, and CoapServer is used to initialize the embedded Californium CoAP Server instance.

Robot CoAP resources

Several CoAP resources were implemented and can be found in the xyz.sergevas.iot.bcbot.robot.coap package. The current implementation contains a hierarchy of CoAP resources with the MoveResource class as the root resource for robot motion control. To configure the CoAP server, org.apache.camel.main.MainListener was implemented:

    public static class RobotManagementEvent extends MainListenerSupport {
      
        @Override
        public void afterStart(MainSupport main) {
            try {
                ROBOT_COAP_SERVER.add(new RobotResource("robot", main.getCamelContexts().get(0)));
                LOG.debug("Starting the robot CoAP server...");
                ROBOT_COAP_SERVER.start();
                LOG.debug("The robot CoAP server started...");
            } catch (Exception e) {
                LOG.error("Unable to configure and start the robot CoAP server...", e);
                throw new RuntimeException("Unable to configure and start the robot CoAP server...",
                    e);
            }
        }
      
        @Override
        public void afterStop(MainSupport main) {
            super.afterStop(main);
            LOG.debug("Stopping the robot CoAP server...");
            ROBOT_COAP_SERVER.stop();
        }
    }

Servo control

To control the robot servos, Apache Camel routes were implemented in the ServoRoute class. The Camel Exec component interacts with the Raspberry Pi file system to update the /dev/servoblaster file, thus controlling the servo speed and rotation direction. Here is an example of a URI that executes the echo command with the Linux command interpreter:

to("exec:sh?args=-c%20%22echo%20{{servo.right.pin}}={{servo.right.speed.slow.fw}}%20%3E%20"
+ "/dev/servoblaster%20;%20echo%20{{servo.left.pin}}={{servo.left.speed.slow.fw}}%20%3E%20"
+ "/dev/servoblaster%22")

All constant values have been moved to the properties file. After substitution of the values, the URI becomes:

exec:sh?args=-c "echo 0=162 > /dev/servoblaster ; echo 0=140 > /dev/servoblaster"
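
To check what the shell actually receives, the percent-encoded args part of such a URI can be decoded with the JDK alone. The sketch below is a hypothetical helper, not part of the project: it assumes that args is the last parameter of the URI and it leaves the {{...}} property placeholders untouched.

```java
import java.net.URLDecoder;
import java.nio.charset.StandardCharsets;

/** Hypothetical helper that shows the shell arguments hidden in an exec URI. */
final class ExecUriDecoder {

    /** Returns the decoded value of the args parameter (assumed to be the last one). */
    static String decodeArgs(String uri) {
        int idx = uri.indexOf("args=");
        if (idx < 0) {
            throw new IllegalArgumentException("No args parameter in: " + uri);
        }
        String encoded = uri.substring(idx + "args=".length());
        // %20 -> space, %22 -> double quote, %3E -> '>'
        return URLDecoder.decode(encoded, StandardCharsets.UTF_8);
    }

    public static void main(String[] unused) {
        String uri = "exec:sh?args=-c%20%22echo%20{{servo.right.pin}}={{servo.right.speed.slow.fw}}%20%3E%20"
                + "/dev/servoblaster%20;%20echo%20{{servo.left.pin}}={{servo.left.speed.slow.fw}}%20%3E%20"
                + "/dev/servoblaster%22";
        System.out.println(decodeArgs(uri));
    }
}
```

Decoding the URI this way makes it easier to spot a stray space or a missing quote before the route is deployed to the robot.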

The routes are called from CoAP resources through a proxy service that is created using the Camel ProxyBuilder class:

public class BackwardResource extends CoapResource {

    private static final Logger LOG = Logger.getLogger(BackwardResource.class);

    // Proxy that forwards move(...) calls to the backward-motion Camel route
    private RobotMove robotMove;

    public BackwardResource(String name, CamelContext camelContext) throws Exception {
        super(name);
        robotMove = new ProxyBuilder(camelContext).endpoint(ROBOT_MOVE_BACKWARD_ROUTE)
                .build(RobotMove.class);
    }

    @Override
    public void handlePUT(CoapExchange exchange) {
        LOG.debug(format("Start handle PUT for CoapExchange with text [%s]",
                exchange.getRequestText()));
        // The request body carries the speed mode
        String robotSpeedMode = exchange.getRequestText();
        LOG.debug(
                format("Start calling Camel route to move the robot backward with speed mode [%s]",
                        robotSpeedMode));
        robotMove.move(robotSpeedMode);
        LOG.debug("End calling Camel route...");
        exchange.respond(CHANGED);
    }
}
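
The idea behind such a proxy service can be illustrated with the JDK's own java.lang.reflect.Proxy. This is only a conceptual analogue, not how Camel's ProxyBuilder is implemented: the Consumer stands in for a Camel endpoint, and the void return type of RobotMove.move is an assumption.

```java
import java.lang.reflect.Proxy;
import java.util.function.Consumer;

/** Conceptual sketch: an interface call is turned into a message sent to an endpoint. */
final class ProxySketch {

    /** Assumed shape of the service interface used by the CoAP resources. */
    interface RobotMove {
        void move(String speedMode);
    }

    /** Builds a RobotMove whose calls are forwarded to the given endpoint stand-in. */
    static RobotMove proxyTo(Consumer<Object> endpoint) {
        return (RobotMove) Proxy.newProxyInstance(
                RobotMove.class.getClassLoader(),
                new Class<?>[] {RobotMove.class},
                (proxy, method, args) -> {
                    // toString, hashCode and equals also arrive here; answer them locally.
                    if (method.getDeclaringClass() == Object.class) {
                        if (method.getName().equals("toString")) {
                            return "RobotMove proxy";
                        }
                        if (method.getName().equals("hashCode")) {
                            return System.identityHashCode(proxy);
                        }
                        return proxy == args[0]; // equals
                    }
                    // The single method argument becomes the message body.
                    endpoint.accept(args[0]);
                    return null;
                });
    }
}
```

The caller only sees the RobotMove interface, so the CoAP resource code stays free of messaging details, which is the benefit the proxy service gives in the robot software as well.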

Deployment model

The project is built and deployed in one click as an executable fat jar using the Maven Shade Plugin. A Linux service wrapper was created to run the robot software on the Raspberry Pi. The service .sh file is generated from an Apache Velocity template during the Maven deploy phase every time the build process runs.

Demo

I have recorded a short video that demonstrates how the robot can be controlled remotely using the Copper CoAP user agent for Firefox. You can see the robot resources:

Currently the resources support only the CoAP PUT method.
As a message body, the resources expect a string with one of the following values:
"SLOW", "MEDIUM" or "FAST". These values represent the robot speed modes. The value is passed as the input message payload to the Camel routes and used in a content-based router EIP implementation:
from(ROBOT_MOVE_FORWARD_ROUTE)
    .choice()
        .when(bodyAs(String.class).isEqualToIgnoreCase(SERVO_SPEED_MODE_SLOW))
            .to("exec:sh?args=-c%20%22echo%20{{servo.right.pin}}={{servo.right.speed.slow.fw}}%20%3E%20/dev/servoblaster"
                + "%20;%20echo%20{{servo.left.pin}}={{servo.left.speed.slow.fw}}%20%3E%20/dev/servoblaster%22")
        .when(bodyAs(String.class).isEqualToIgnoreCase(SERVO_SPEED_MODE_MEDIUM))
            .to("exec:sh?args=-c%20%22echo%20{{servo.right.pin}}={{servo.right.speed.medium.fw}}%20%3E%20/dev/servoblaster"
                + "%20;%20echo%20{{servo.left.pin}}={{servo.left.speed.medium.fw}}%20%3E%20/dev/servoblaster%22")
        .when(bodyAs(String.class).isEqualToIgnoreCase(SERVO_SPEED_MODE_FAST))
            .to("exec:sh?args=-c%20%22echo%20{{servo.right.pin}}={{servo.right.speed.fast.fw}}%20%3E%20/dev/servoblaster"
                + "%20;%20echo%20{{servo.left.pin}}={{servo.left.speed.fast.fw}}%20%3E%20/dev/servoblaster%22")
    .end().to("log:xyz.sergevas.iot.bcbot?level=DEBUG&showHeaders=true");
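
Stripped of Camel, the content-based routing above boils down to a case-insensitive lookup from the message body to an action. The sketch below shows that idea in plain Java; the class name and the action labels are hypothetical, and an unmatched body simply yields no action, much as the route above has no otherwise() branch.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;
import java.util.Optional;

/** Sketch of a content-based router keyed by speed mode; names are hypothetical. */
final class SpeedModeRouter {

    private final Map<String, String> actionsByMode = new HashMap<>();

    SpeedModeRouter(Map<String, String> actionsByMode) {
        // Normalize the keys once so that lookups can ignore case.
        actionsByMode.forEach((mode, action) ->
                this.actionsByMode.put(mode.toUpperCase(Locale.ROOT), action));
    }

    /** Returns the action for the message body, or empty if no branch matches. */
    Optional<String> route(String body) {
        if (body == null) {
            return Optional.empty();
        }
        return Optional.ofNullable(actionsByMode.get(body.toUpperCase(Locale.ROOT)));
    }
}
```

For example, a router built from Map.of("SLOW", "slow-forward", "FAST", "fast-forward") resolves the body "slow" to the "slow-forward" action and returns an empty Optional for an unknown mode.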

Feasibility Test

And here is a small test that can be considered successful :). The efforts of my six-month-old son encouraged me to continue my work.