Machine Learning Based Activity Learning for Behavioral Contexts in Internet of Things (IoT)

Ontology-based activity learning models play a vital role in diverse Internet of Things (IoT) domains such as smart homes, smart hospitals and smart communities. The prevalent challenges with ontological models are their static nature and their inability to self-evolve. Such models cannot be completed at once, and smart home inhabitants cannot be restricted to a fixed set of activities. Moreover, inhabitants are not predictable by nature and may perform "Activities of Daily Life (ADL)" not listed in the ontological model. This gives rise to the need for an integrated framework built on a unified conceptual backbone (i.e. activity ontologies) that addresses the lifecycle of activity recognition and produces behavioral models according to an inhabitant's routine. In this paper, an ontology evolution process is proposed that learns particular activities from an existing set of activities of daily life (ADL). It learns new activities that have not been identified by the recognition model, adds new properties to existing activities, and learns the inhabitant's newest behavior of performing activities through an Artificial Neural Network (ANN). An improved true positive rate evidences activity recognition with detection of noisy sensor data. The effectiveness of the proposed approach is evident from improved rates of activity learning, activity detection and ontology evolution.


INTRODUCTION
This era of technological revolutions is human centric, since technology adoption aims to facilitate humans. One of the promising applications of these technologies is the Internet of Things (IoT). In alignment with such trends, human activity recognition (AR) and activity learning (AL) have become natural enablers of adaptive IoT technologies. AR/AL has become a research focus in domains such as pervasive and mobile computing [1], ambient assisted living [2], social robotics [3], surveillance-based security [4] and context-aware computing [5].
Activity learning is based on activity recognition in different domains and environments. In order to recognize an activity, different types of sensors are installed in human residences so that behaviors, actions and environmental changes can be captured, modeled and analyzed.
Activity recognition is based on ontological models: sensor information is analyzed and processed, and ontological models are then used to recognize activities. Activity learning (which complements the AR process) in turn depends on the results obtained from activity recognition. There are two main types of AR techniques laying the foundation for activity learning: (a) data driven and (b) knowledge driven.
Data Driven Techniques: use sensor data to develop activity models through machine learning algorithms and data mining techniques. These techniques have the ability to handle noise, uncertainty and incompleteness in the sensor data stream, but suffer from data scarcity and non-scalability.
Knowledge Driven Techniques: use domain knowledge in the required field to generate models of user activities with the help of knowledge-based techniques such as knowledge management and engineering. A major drawback of knowledge-driven techniques is that only static activity models can be built [6]. Once developed, a static activity model cannot be updated with respect to changes in the inhabitant's actions. Another limitation is the difficulty of developing a complete activity model covering all possible use cases for all users in real life. Knowledge-driven activity recognition is based on the activities of daily life with a specific duration of time, location and space (as modeled in the ontology).
Ontological models are handy in activity recognition, but they are static in nature and cannot self-evolve. In order to update these ontological models, neural networks have been employed to learn the different actions that make up an activity step by step.
In this paper, a continuous activity modeling process has been proposed, where domain experts provide initial generic activity models using knowledge engineering tools [7]. These generic activity models are integrated into the smart home environment. Activity logs are generated on the basis of the activity recognition process. An intelligent technique (ANN) [8] is used to learn the behavioral model of the inhabitant from the activity log file. Such behavioral models are presented to the domain expert, who can add or update the inhabitant's activity model in the knowledge base. This loop is repeated continuously to achieve dynamic, behavioral models based on the initially provided generic models and user-generated data. A generic activity learning process for obtaining the behavioral model (also called the complete model) of a learner is provided in Fig. 1.
Behavioral Model: In order to have a complete activity model from generic models (provided by domain experts), learning the actions is a key process. For example, if we have an activity named "bathing"; it consists of actions such as "turnOnShower" and "hasSoap".
These are the minimum required actions for anyone to take a bath. In contrast, another inhabitant may use shampoo or a towel in this activity. The basis of the proposed approach is to determine the minimum required actions (generic model) to perform an activity, accumulate data generated by other inhabitants to learn about their actions, and acquire a behavioral model of the same activity (with varying actions), as exemplified in Fig. 2.
As the inhabitant's actions evolve, the learning system learns new versions of the initial activity model; the model thus evolves with the inhabitant and can adapt to changing user behaviors. This allows experts to define generic activities at higher levels of abstraction and let the system develop behavioral models that create specialized knowledge.
The rest of the paper is organized as follows: section 2 provides a brief review of activity recognition and activity learning techniques, section 3 describes the proposed framework for activity learning followed by results and evaluation in section 4. Section 5 concludes the work with potential future directions.

LITERATURE REVIEW
In this section, a brief review of research on human activity learning and recognition is provided. Recent activity learning techniques are then analyzed from different aspects, including vision-based models, sensor-based learning models, image and video recognition, and mobile sensor-based activity recognition [32][33][34][35][36].
Extensive research has been carried out since the paradigm shifted from traditional systems to the smartness of mobile devices [15], [16]. Modern wearable sensor technologies include smartphones, smart rings, clothing sensors, healthcare sensors and dedicated sensors attached to different parts of the body. They observe basic actions like motion, walking, running, bending, lifting an arm, or focus of the eyes. A series of such actions makes the model capable of predicting activities like making tea or preparing breakfast.
In [9], data-driven AR is combined with knowledge-based methods to handle the uncertainty of sensors. A knowledge-based approach for concurrent AR has been presented by Ye in [10]. This approach explores the context of sensor activation and uses context dissimilarity to cluster a continuous sensor sequence into chunks. Each cluster corresponds to one "under process" activity. It exploits the Pyramid Match Kernel (PMK) approach, augmented with WordNet matching on hierarchical concepts, in order to recognize activities using the domain ontology from a potentially noisy sensor sequence.
Moreover, a dynamic window approach is used to segment the sensor stream. Riboni et al. [11] proposed a framework for the representation of sensors, devices, activities and atomic actions. Their approach demonstrates a method for combining description logic (DL) with probabilistic reasoning. However, the probability values used are not grounded in the semantics and were assigned manually. The approach also models concepts without fully considering their relations, and is static in nature with its DL rules, so it cannot order actions in cases the DL does not characterize. This work recognizes activity occurrences but lacks the capacity to recognize the personalized behavior of the inhabitants.
Okeyo et al. [12] combine ontological and temporal knowledge formalisms to provide a representation for composite activity modeling. This paper also describes the entailment rules so as to dynamically infer the composite activities. Simple activities modeled in this paper are static in nature. Our work is distinguished from [12] in two aspects. Firstly, it generates a complete activity model over the foundation of a generic model. Secondly, it recognizes the activity intervals dynamically.
Azkune in [13] exploits contextual knowledge in a data-driven technique to recognize personalized activities. However, [13] recognizes only sequential activities and lacks the ability to recognize parallel activities. Temporal information, an integral part of activity recognition, has not been catered for; rather, "duration", "location" and "activity-type" properties have been used to recognize the activities. A dynamic window size for segmenting the sensor stream is employed by [14] and [25], which introduce a mechanism for real-time continuous activity recognition using ontological knowledge for sequential activities.
Supervised learning uses labeled data to train an algorithm that can then classify unlabeled data [23]. In activity learning, the supervised approach involves steps such as data representation, transformation from multiple data sources, and division of the data into training and test sets for the learning model. Table 1 illustrates a comparison summary of promising hybrid AR techniques for parallel and interleaved activities, along with the modeling technique proposed in our research.
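The preparation steps mentioned above (labeled data divided into training and test sets) can be sketched as follows; the feature vectors, labels and split ratio are illustrative assumptions, not taken from the actual dataset:

```python
import random

# Illustrative labeled sensor windows: (feature vector, activity label).
labeled = [([1, 0, 1], "MakeTea"), ([0, 1, 1], "MakeCoffee"),
           ([1, 1, 0], "Bathing"), ([0, 0, 1], "MakeTea")]

def train_test_split(data, train_ratio=0.75, seed=42):
    """Shuffle the labeled data and divide it into training and test sets."""
    rng = random.Random(seed)
    shuffled = data[:]
    rng.shuffle(shuffled)
    cut = int(len(shuffled) * train_ratio)
    return shuffled[:cut], shuffled[cut:]

train, test = train_test_split(labeled)
print(len(train), len(test))  # 3 1
```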

PROPOSED APPROACH
The proposed framework, which learns a behavioral model from a generic model, is detailed here and illustrated in Fig. 3. Ontology-based activity modeling is provided with identified constraints, asserting that complete and generic activity models are not possible simultaneously.
Once the inhabitant's information is available from the sources (sensors, in our case), the next step is to represent this knowledge. Depending on the information collected, two types of activity models were developed: the generic model (ontological model) and the behavioral model (JSON file).

Generic Model (Ontological Model)
Generic activity models can be represented through ontology concepts as triples (i.e. subject, predicate and object). For example, brushingTeeth is an activity that is performed twice a day, in the morning and before bedtime. In general, it involves the use of a toothbrush and water. This is generally called the frame of the activity. Activities are defined as ontological concepts, and all actions essential to complete the activity are defined as properties of the concept.
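As a sketch, such a frame can be written as subject-predicate-object triples; the predicate names below are illustrative assumptions, not taken from the actual ontology:

```python
# Generic activity model for brushingTeeth expressed as triples
# (subject, predicate, object); predicate names are illustrative.
triples = [
    ("brushingTeeth", "rdf:type", "Activity"),
    ("brushingTeeth", "requiresObject", "toothbrush"),
    ("brushingTeeth", "requiresObject", "water"),
    ("brushingTeeth", "hasFrequency", "twicePerDay"),
]

# Query the frame: which objects are essential for the activity?
required = [o for s, p, o in triples
            if s == "brushingTeeth" and p == "requiresObject"]
print(required)  # ['toothbrush', 'water']
```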

Behavioral Model (JSON File)
A more specialized behavioral model (derived from the generic activity model) was developed after the process of activity learning. These behavioral models evolve throughout the activity learning process. Generic models are represented through ontology files, whereas behavioral models take the form of JSON files. Behavioral models are specialized cases of generic models, for example "Bathing" as illustrated in Fig. 2. In order to incorporate new "Bathing" patterns with added or different actions into the domain knowledge, a behavioral model indicating specialized behavioral traits for that specific activity needs to be developed. It can thus be concluded that generic models are characterized by a sequence of necessary actions to perform an activity plus a duration estimation, and behavioral models are an extended form of the generic model.
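A minimal sketch of how a behavioral model could extend the generic "Bathing" model as JSON; the field names are assumptions for illustration, not the paper's actual schema:

```python
import json

# Generic model: minimum required actions plus a duration estimate.
generic_bathing = {"activity": "Bathing",
                   "requiredActions": ["turnOnShower", "hasSoap"],
                   "durationMinutes": 15}

# Behavioral model: a specialization learned for one inhabitant,
# extending the generic model with observed optional actions.
behavioral_bathing = dict(generic_bathing)
behavioral_bathing["optionalActions"] = ["useShampoo", "useTowel"]

print(json.dumps(behavioral_bathing, indent=2))
```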
In order to recognise an activity, a two-step algorithm has been designed and implemented. The first step, named the sensor action mapping step, uses sensor information from the context knowledge to transform sensor activations into actions. The second step, the activity finding step, runs a pattern recognition algorithm using the generic and behavioral models. The process is explained in Listing 3.1.

Sensor Action Mapping
Inputs from the sensor activation dataset are received and transformed into domain knowledge after every sensor activation. For each sensor activation in the dataset, the corresponding sensor model is checked in the domain knowledge. Every sensor has the action to which it is mapped in the domain knowledge files provided by the domain expert. If we consider bathing sensors as input, for example, each activation is mapped to its corresponding action such as "turnOnShower" or "hasSoap".
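A sketch of this mapping step, assuming the domain expert provides a sensor-to-action table; the sensor and action names are illustrative:

```python
# Sensor-to-action table provided by the domain expert (illustrative names).
sensor_to_action = {
    "showerTapSensor": "turnOnShower",
    "soapSensor": "hasSoap",
    "kettleSensor": "turnOnKettle",
}

def map_activations(activations):
    """Transform sensor activations into actions; unknown sensors are
    kept aside so the domain expert can extend the domain knowledge."""
    actions, unknown = [], []
    for sensor in activations:
        if sensor in sensor_to_action:
            actions.append(sensor_to_action[sensor])
        else:
            unknown.append(sensor)
    return actions, unknown

actions, unknown = map_activations(["showerTapSensor", "soapSensor", "fanSensor"])
print(actions)   # ['turnOnShower', 'hasSoap']
print(unknown)   # ['fanSensor']
```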

Activity Finding
The objective of this step is to find all valid occurrences of domain knowledge in the action pattern obtained from the sensor action mapping step. An iterative process is run over the actions. First, the algorithm checks whether the current action relates to any of the behavioral models stored in the JSON file. If not, it checks the generic models. It is worth mentioning that if an action relates to more than one generic or behavioral model, all the possibilities are treated. For example, if an action sensor for "cup" is considered, it is used in the MakeTea activity as well as in MakeCoffee. In the activity finding step, each action is stored in the sensor stream file (SSF), which assists in behavioral learning.
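The lookup described above can be sketched as follows; when an action (such as "cup") belongs to several models, every candidate is kept open. The model contents are illustrative assumptions:

```python
# Illustrative behavioral (JSON-derived) and generic models.
behavioral_models = {"Bathing": {"turnOnShower", "hasSoap", "useShampoo"}}
generic_models = {"MakeTea": {"cup", "kettle", "teabag"},
                  "MakeCoffee": {"cup", "kettle", "coffee"}}

def candidate_activities(action):
    """Return all models an action may belong to: behavioral models
    are checked first, then generic models; every match is kept."""
    matches = [name for name, acts in behavioral_models.items() if action in acts]
    if not matches:
        matches = [name for name, acts in generic_models.items() if action in acts]
    return matches

print(candidate_activities("cup"))  # ['MakeTea', 'MakeCoffee']
```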

Activity Learner
The purpose of this step is to describe and analyze the proposed learning algorithm for specialized, behavioral activity models, called the Activity Learner (AL), as shown in Fig. 4.
AL uses the results stored in the Sensor Stream File (SSF), where different action sequences for every activity are identified during the activity recognition process. The aim of AL is to learn the behavioral activity model from the information given by the SSF. An Artificial Neural Network (ANN) is used for learning. Let (S1, S2, S3, ..., SM) be the input sensors, where M is the total number of sensors in the activity log file, and let (P1, P2, P3, ..., PN) be the generalized activities mapped onto the hidden layer, where N is the total number of generalized activities:

P = (P1, P2, ..., PN) (3.2)

The weight on each edge is:

w = (l, t, at) (3.3)

where l is the location, t is the time and at is the activity type. The value of t is either 0 or 1 and is calculated by:

t = 1 if t1 <= ts <= t2, else t = 0 (3.4)

where t1 is the activity start time, t2 is the activity end time and ts is the sensor activation time: if ts lies between t1 and t2, t is 1, otherwise 0. The output y is calculated as:

y = [l = P(l)] + [at = P(at)] + t (3.5)

where l is the location of the input sensor, P(l) is the location of the performed activity, at is the activity type of the input sensor, P(at) is the activity type of the performed activity, [·] equals 1 when the enclosed condition holds and 0 otherwise, and t is as defined in equation (3.4). An output table holds the output record of all input sensor calculations.
Positive sensor noise can be detected on the output layer by comparing y against the target output T: activations with y < T are treated as noise. T is the target output for every sensor; in our scenario its value is 3, and y is the output of every sensor.
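The matching of location, activity type and time window against the target output T = 3 can be sketched as follows; the dictionary field names and sample values are illustrative assumptions:

```python
# Sketch: score a sensor activation against a candidate activity
# using its location, activity type and time window.
def time_match(ts, t1, t2):
    """t = 1 if the sensor activation time falls in the activity window."""
    return 1 if t1 <= ts <= t2 else 0

def output_score(sensor, activity):
    """y: number of matching properties (location, activity type, time)."""
    y = 0
    y += 1 if sensor["location"] == activity["location"] else 0
    y += 1 if sensor["activity_type"] == activity["activity_type"] else 0
    y += time_match(sensor["time"], activity["start"], activity["end"])
    return y

T = 3  # target output: all three properties must match
sensor = {"location": "kitchen", "activity_type": "MakeTea", "time": 5}
activity = {"location": "kitchen", "activity_type": "MakeTea",
            "start": 0, "end": 10}
y = output_score(sensor, activity)
is_noise = y < T  # activations that do not fully match are treated as noise
print(y, is_noise)  # 3 False
```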
After the calculation on all hidden and output layer, the output table will be populated with calculated results. This output table has all occurrences of optional sensors and also noise sensors with their respective generic models and behavioral models.

RESULTS AND EVALUATION
An Activity of Daily Life (ADL) is a set of actions performed by an individual in their daily routine. Some instances of ADLs are making tea, making pasta or washing clothes. The ANN algorithm was applied over the dataset from the SSF. All optional sensor rows were mapped onto the input layer and annotated rows onto the hidden layer.
One of the most challenging aspects was obtaining a comprehensive dataset that covers all the discussed scenarios and provides a baseline repository for evaluating the proposed model. A synthetic dataset generator was used for this purpose. A handful of inhabitants (household women, servants and hostel students) participated in a survey of object actions (generic activity model actions + optional actions). The activity patterns, activity durations and locations were acquired and analysed from the surveys. This assisted us in building "ground truth" datasets with all possible patterns of necessary and optional actions. Figure 5 illustrates a snapshot of the proposed synthetic data generator tool. The generator applies its algorithm over the Ground Truth Dataset and the Domain Ontology. Activities spanning two months were simulated in order to obtain reliable results (as shown in Table 2). There are 1258 sensor activations in the sensor activation dataset, describing a total of 294 activity instances. The results are shown in Table 2: the percentages of true positives, false positives and false negatives are given along with the total number of occurrences of each activity.
If the number of occurrences of an optional sensor with a generic activity model is greater than or equal to 70%, that sensor is encoded in the behavioral model (JSON); otherwise it is treated as noise and discarded.
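The 70% rule can be sketched as follows; the occurrence counts are illustrative assumptions:

```python
# Occurrences of each optional sensor alongside a generic activity,
# out of the total occurrences of that activity (illustrative counts).
activity_occurrences = 20
optional_sensor_counts = {"useShampoo": 17, "useTowel": 15, "fanSensor": 3}

THRESHOLD = 0.70  # minimum co-occurrence rate to enter the behavioral model

behavioral_actions = [s for s, c in optional_sensor_counts.items()
                      if c / activity_occurrences >= THRESHOLD]
noise = [s for s in optional_sensor_counts if s not in behavioral_actions]
print(behavioral_actions)  # ['useShampoo', 'useTowel']
print(noise)               # ['fanSensor']
```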
In the activity finding step (Section 3.2), if a sensor activation mapped to one of the generic models is missed, the proposed system fails to detect that activity. To test this scenario, two of the seven activities were selected, and the sensor activations mapped to their generic models were given an increased missing probability, in order to see how missing noise affects the detection of those activities.
The rest of the activities were kept noiseless to isolate the effect of missing noise. Table 3 shows the results for the sensor missing noise scenario; the activities which suffer missing noise are marked with an asterisk (MakeTea and MakeCoffee). As can be observed in Table 3, the performance of the proposed system for all the remaining activities stays the same, producing 100% true positives, while the true positives of the noisy activities are greatly reduced.
The reduction of true positives generates the appearance of false negatives, since those activities not detected are labeled as noise. Thus, they are considered false negatives. There is no effect in false positives, but the effect on true positives is clear, as expected.

CONCLUSION AND FUTURE WORK
An ontology-based activity learning framework has been proposed in this paper. It is implemented through data annotation, activity recognition, an ANN-based activity learner and an annotated sensor dataset. As a result of the activity learning process, behavioral activity models have been obtained as personalized activity models of inhabitants in smart homes.
The activity recognition and learning approach relies on the interaction with objects monitored by sensors. One of the limitations of this approach is that the generalized activities are not mutually exclusive, i.e. optional sensors can be part of more than one behavioral activity. We look forward to resolving this limitation by assigning each optional sensor to its exact generalized activity model.