Cloud-based CEP on Healthcare @ Decision Camp 2014

Hello everyone,

This time I’d like to share a talk I gave at Decision Camp 2014 about cloud-based Complex Event Processing implementations, along with lessons learned from implementing Complex Event Processing solutions in healthcare. Hope you all enjoy it!

Cheers,

jBPM6 Developer Guide coming out soon!

Hello everyone. This post is just to let you know that jBPM6 Developer Guide is about to be published, and you can pre-order it from here and get a 20% to 37% discount on your order! With this book, you can learn how to:

  • Model and implement different business processes using the BPMN2 standard notation
  • Understand how and when to use the different tools provided by the JBoss Business Process Management (BPM) platform
  • Learn how to model complex business scenarios and environments through a step-by-step approach

Here you can find a list of what you will find in each chapter:

Chapter 1, Why Do We Need Business Process Management?, introduces the BPM discipline. This chapter will provide the basis for the rest of the book, giving an understanding of why and how the jBPM6 project has been designed, and the path its evolution will follow.

Chapter 2, BPM Systems Structure, goes in depth into the main pieces and components inside a Business Process Management System (BPMS). This chapter introduces the concept of a BPMS as the natural follow-up to an understanding of the BPM discipline. The reader will find a deep and technical explanation of how a BPM system core can be built from scratch and how it will interact with the rest of the components in the BPMS infrastructure. This chapter also describes the intimate relationship between the Drools and jBPM projects, which is one of the key advantages of jBPM6 in comparison with all the other BPMSs, as well as existing methodologies for connecting a BPMS with other systems.

Chapter 3, Using BPMN 2.0 to Model Business Scenarios, covers the main constructs used to model our business processes, guiding the reader through an example that illustrates the most useful modeling patterns. The BPMN 2.0 specification has become the de facto standard for modeling executable business processes since it was released in early 2011, and is recommended for any BPM implementation, even outside the scope of jBPM6.
Chapter 4, Understanding the Knowledge Is Everything Workbench, takes a look at the tooling provided by the jBPM6 project, which will enable the reader both to define new processes and to configure a runtime to execute them. The overall architecture of the tooling is also covered in this chapter.

Chapter 5, Creating a Process Project in the KIE Workbench, dives into the required steps to create a process definition with the existing tooling, as well as to test it and run it. The BPMN 2.0 specification will be put into practice as the reader creates an executable process and a compiled project where the runtime specifications will be defined.

Chapter 6, Human Interactions, covers in depth the Human Task component inside jBPM6. A big feature of BPMSs is the capability to coordinate human and system interactions. This chapter also describes how the existing tooling builds a user interface using the concepts of task lists and task forms, exposing the end users involved in the execution of tasks from multiple process definitions to a common interface.

Chapter 7, Defining Your Environment with the Runtime Manager, covers the different strategies provided to configure an environment to run our processes. The reader will see the configurations for connecting external systems, human task components, and persistence strategies, the relation a specific process execution has with its environment, as well as methods to define their own custom runtime configuration.

Chapter 8, Implementing Persistence and Transactions, covers the shared mechanisms between the Drools and jBPM projects used to store information and define transaction boundaries. When we want to support processes that coordinate systems and people over long periods of time, we need to understand how the process information can be persisted.

Chapter 9, Integration with other Knowledge Definitions, gives a brief introduction to the Drools Rule Engine. It is used to mix business processes with business rules in order to define advanced and complex scenarios. We also cover Drools Fusion, a feature of the Drools Rule Engine that adds temporal reasoning capabilities, allowing business processes to be monitored, improved, and covered by business scenarios that require temporal inferences.

Chapter 10, KIE Workbench Integration with External Systems, describes the ways in which the provided tooling can be extended with extra features, along with a description of all the different extension points provided by the API and exposed by the tooling. A set of good practices is described in order to give the reader a comprehensive way to deal with different scenarios a BPMS will likely face.

Appendix A, The UberFire Framework, goes into detail about the base utility framework used by the KIE Workbench to define its user interface. The reader will learn the structure and usage of the framework, along with a demonstration that will enable the extension of any component in the workbench distribution you choose.

Hope you like it! Cheers,

Why Processes?

This is meant to be a sequel to my previous post, where we started discussing the benefits of writing rules. I intend to do a similar thing, but now centred on the concept of Business Processes. As with last time, this post is oriented to people just getting started with the concept of processes, who are still getting into it. Processes are a simpler concept to grasp than rules, but I still think we should start with a basic description.

What is a process?

The simplest explanation of a process I can come up with is “a sequence of steps to achieve something”. Actually, I wasn’t able to come up with it; it is a dictionary definition. But it is as simple as that: just a sequence of actions that need to be taken, by either systems or people, to achieve a goal. If we talk about a business process, we just need to add a little more definition and say it is a sequence of steps in a specific domain, performed by systems or people, to achieve a specific business objective.

If you come from the developer end you might be thinking “so? any code I ever wrote is a sequence of steps towards something!”. And you’re right! Code itself follows a sequence. It also has a purpose. If it reads inputs from a specific source, it can interact with systems or people, depending on whether the source is a specific integration component or a user interface. There are, however, several advantages that process definitions provide over plain code. We’ll cover some of them in this post.

Diagramming structure

The first and most user-friendly feature is the possibility of seeing the process definition as a diagram. We won’t need to see the process as an obscure list of method invocations, but as an image like this:

If defined in the BPMN2 standard (currently the most widely accepted standard for defining executable processes), all the diagramming information is available in the process definition itself, so the same information used to determine the next step to follow also defines the structure for visualizing the process. This is very helpful for business users, who probably don’t know about a specific technology but have a very clear understanding of what needs to be done at what time. Which leads to the next advantage.

Process definition as a collaborative tool

Once you have a diagram, non-technical people can understand each step of what is happening. Thanks to the process editors provided by each BPM implementation, people with no specific development background can also edit the content to match the real-world scenario. This makes the process definition the centre of our development effort. It not only serves as documentation thanks to the diagram, but also works as a connection point between business analysts and technical people. Business analysts need to worry only about the specifics of the process definition (that it matches the business objectives, the order in which the steps should be performed, who should be in charge of each human task, and so on), and technical people need to worry about integrating the process with any other existing service. Which leads to another advantage.

Business users take care of business. Development takes care of integration

This eventually frees the development side from having to worry about making each component follow ever-changing processes. Developers, in this paradigm, only need to worry about providing the necessary connectors to different systems and people. By connectors, we mean system invokers that can be called from the process, components that will be able to invoke or signal a process, and forms for people to be able to complete human tasks. From a developer’s perspective, the process will be something like this:

Here you will only need to worry about connecting processes with other systems and providing a UI for new forms. In a way, it releases developers from having complete responsibility for defining the structure of the system, allowing more people to collaborate in the definition of how things are done, and letting developers concentrate on connecting those definitions with other systems.
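
To make the “connectors” idea a bit more concrete, here is a minimal sketch of a jBPM6 work item handler, the usual extension point for plugging external systems into a process. The notification logic and the "Notification" task name are invented for illustration; only the WorkItemHandler API itself comes from jBPM.

import org.kie.api.runtime.process.WorkItem;
import org.kie.api.runtime.process.WorkItemHandler;
import org.kie.api.runtime.process.WorkItemManager;

// Hypothetical connector: invoked every time the process reaches a "Notification"
// service task, it calls an external system and then tells the engine the work
// item is done so the process can move on to the next step.
public class NotificationWorkItemHandler implements WorkItemHandler {

    public void executeWorkItem(WorkItem workItem, WorkItemManager manager) {
        String recipient = (String) workItem.getParameter("recipient");
        // ... invoke the external notification system here ...
        manager.completeWorkItem(workItem.getId(), null);
    }

    public void abortWorkItem(WorkItem workItem, WorkItemManager manager) {
        // nothing to clean up in this sketch
    }
}

Registering it against a session is a one-liner: ksession.getWorkItemManager().registerWorkItemHandler("Notification", new NotificationWorkItemHandler());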

Scalability

Since development efforts on the process definition side will decrease over time, people in the development area can start working on what they know how to do best: better connectors and better environment configurations. Eventually, as process execution demand starts growing, you can start working on the scalability of your BPM systems. And since all activities revolve around the same structures (the process executions), you can concentrate on making processes spend as little time as possible executing steps, managing signals in distributed ways, and managing responses from the connectors in a way that allows more than one node to receive them.

This leads not only to better-running systems, but also to more manageable scalability. As all core processes of the company start moving to BPM systems instead of legacy applications, high availability and scalability of systems start becoming ubiquitous features.

A company as a process (composed of other processes)

The final stage of this endeavour is getting more and more of the company’s processes working inside a BPM system. Since processes can invoke other processes inside them (called sub-processes), you can build a hierarchy of components that can be as broad as desired, to cover as many of your company’s operations as you wish. An ideal point in this stage is reaching the main process of the company, where client or service requests are usually the input, and the objectives of the company are the output. Inside that process (which, believe it or not, could be quite a simple diagram), you will invoke every other process you defined for each specific case, in a hierarchy of process invocations.

Summing up

Processes become a great way to involve as many parts of the company as possible in the definition of the steps needed to perform a task. Nobody knows better how to do a specific task than the operation teams, and nobody understands better the objectives of a company than the managers. Also, nobody understands better how to accomplish those tasks in an automated way than the development teams. With process definitions, each one of them can collaborate on the same component, all working together toward the same goal: achieving the objectives of the company.

Also, process engines should take care of two things: coordinating tasks between systems and people, and storing the state of currently active processes. This is because they should let other systems with other objectives do what they do best. Web services should be in charge of services. Rules should be in charge of complex decisions. In general, a good process engine should try to be something like the following description (which I like to read with De Niro’s voice in this scene of “Men of Honor” 🙂 ):

A process engine isn’t a monolithic system, it’s an integration platform. If it should be done by someone else, it invokes it. If it needs a person, it notifies him. If many tasks need to be done at the same time, it synchronizes them. If it is lucky, it will execute for a few milliseconds for a process that should run over a full year, for that is the closest it will ever get to being considered efficient.

And please stay tuned for the next post, where we will be discussing the power of putting rules and processes together!

Why Rules?

I’ve always gone into a lot of detail regarding specific points of the rule engine, but I feel this is a topic which needs to be covered more. So this post is oriented to people who might just be getting started with the concept of rules and are still in doubt. I will try to explain some of the advantages of using business rules and rule engines in a way that, oddly enough, I haven’t found online yet.

What is a rule?

If you got as far as reaching this blog, you probably already have an idea of what a rule is. The most primal explanation we can give is “something that looks like this”:

when a condition is found to be true
then a consequence is executed

And that’s about it. Add as much syntactic sugar as you want on top of it, but a rule will still have that basic structure underneath it all. I still remember the first question that rose to the top of my head when they explained to me what a rule was: how is this any better than writing code? It took me a while to figure it out, but I’ll try to make it easier for you.

Let’s begin by explaining the reason for such a simple structure. It’s based on an Artificial Intelligence principle, where a specific structure of code can be represented as data, in a structure similar to the one shown in the following diagram:

Why would we want to represent our code as data? Because manipulating that data makes a system able to change what it is doing, and that is one of the pillars of machine learning. Having the execution code as data means that we can change the data and make the system behave in a different way without having to restart any component. But that is not even the beginning of why rules are useful:

It is not what one rule can do, but what many together can do

If we think of an extremely simple case, where maybe we have one or two rules to define a scenario, we can definitely think of a simpler way of implementing the same scenario using nothing but Java. The power of rules doesn’t reside in the possibility of describing simple scenarios, but quite the contrary. It lies in describing complex scenarios in a very simple and performant way. It divides a problem into its most atomic components: each individual decision we could take, together with the minimal information we need to take that decision. No rule should extend beyond that complexity.

Rules don’t know each other

This is a principle most rule engines abide by: rules are independent. Each rule doesn’t know whether another rule even exists. All they share is the information about the state of the world. This information is shared, and some rule engines (Drools included) allow rules to modify this information in order for other rules to be activated. This process of activating other rules through changes in the information about the state of the world (called the working memory) is called inference.
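
From the Java side, all of this happens behind a single call to fire the rules. The following is a minimal sketch, assuming a simple Person domain class and a hypothetical rule that inserts an Adult fact whenever it finds a Person older than 18; only the KieServices/KieSession calls are actual Drools 6 API.

import org.kie.api.KieServices;
import org.kie.api.runtime.KieContainer;
import org.kie.api.runtime.KieSession;

public class InferenceExample {
    public static void main(String[] args) {
        // Load the rules packaged on the classpath (Drools 6 style).
        KieContainer kContainer = KieServices.Factory.get().getKieClasspathContainer();
        KieSession kSession = kContainer.newKieSession();

        // Share the "state of the world" with the rules.
        // Person is a simple domain class assumed to exist for this example.
        kSession.insert(new Person("John", 21));
        kSession.fireAllRules();

        // If the hypothetical "adults" rule fired, the working memory now also
        // contains an Adult fact inferred from the Person we inserted.
        System.out.println("Facts in working memory: " + kSession.getObjects().size());

        kSession.dispose();
    }
}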

Because of this mechanism, rules don’t need to be too complex to describe a complex scenario: a complex rule can be divided into simpler rules that infer pieces of information about the scenario and feed them into the working memory, and other rules can depend on that data to get activated. Which leads to the next item, regarding how to manage complex, ever-growing situations.

Scenario becomes more complex? Just add more rules

This is how you should manage complex scenarios. You don’t have to worry about having a very complex list of commands to evaluate each condition of a complex decision. All you need is to keep creating atomic rules until you have covered all the different cases. Rules become the epitome of decoupled software: each rule should just worry about understanding a very small and basic concept from the information it can gather from the state of the world.

Letting the rule engine do its job

Rules should remain that simple because it is the rule engine that will manage the complexity for us. Whenever we need to detect many similar conditions, the rule engine will create a performant way of evaluating those conditions as fast as possible, through the RETE (in Drools 5) and PHREAK (in Drools 6) algorithms. They will do the heavy lifting. It’s something that we have seen before:

If you’re as old as I am, you might have been there when development teams started going from compiled code (like C or C++) to interpreted code (like Java). I didn’t like it that much at first: I needed more resources to run the programs I made, I had to learn new APIs, everything was a class now… it just wasn’t my cup of tea, at least until I got the hang of it. The thing was that, in order to write good Java code, I needed to follow very different standards from those for writing good C code. I didn’t need to create fancy optimizations in my code because those optimizations were going to be done by the VM. And eventually, simpler code became more efficient than the complex code hacks we brought with us from C, mainly because the VM optimizations were doing it so much better than us.

And in a sense, a rule engine does the same thing. It will take all our rules and transform them into something that can run in the most efficient way to find all the matches to our rules. In a system with many rules, the engine’s condition evaluation will be far quicker than the optimizations we might hand-write into a batch-processing code block.

And it is pretty much the same with rule engines and rules. It’s the VM and Java code all over again: someone will be able to do a better job than you at optimizing simple scenarios, so we shouldn’t worry so much about optimizing the code as about making it clearer.

Fitting more hands in a plate

There is a common saying in engineering: you can’t get nine mothers to make one baby in a month. We use it as a clear example of how some tasks cannot be divided any further and have specific dependencies between each other. The phrase is also used to describe situations where you have coupled code: even if we divide a problem into many different tasks, people will end up touching the same components (and experiencing the expected conflicts) when trying to solve two different situations.

One of the greatest advantages of rules is how highly decoupled they are. This means more people can edit the rules without stepping on each other’s toes. The simpler the rules, the less experience is needed to edit them. And thanks to human-language-to-rule mappings (like DSLs and decision tables), you can have non-technical people defining rules. This means more people can get involved in rule definition and help extend a system that could be as complex as necessary, and do it in the simplest of ways.

To be continued

Public Training @San Francisco 2014

Hello,

At Plugtree (http://www.plugtree.com) we’re currently working on a public training for Drools and jBPM in the San Francisco area on April 21-25. This is happening the week after Red Hat Summit, so if you’re in the area, it is a great opportunity to take advantage of! Hope you can make it. If you’re interested in this event, please find more information at http://www.plugtree.com/public-training-san-francisco-2014/. We would love to see you there! Please contact us at training@plugtree.com if you have any questions.

Cheers,

Mariano

Book Review: A Practical Guide to jBPM5

Greetings, everyone. Today I would like to introduce everyone to a soon-to-be-published book for jBPM adopters. Based on the jBPM5 technology stack, A Practical Guide for jBPM5 provides step-by-step, very detailed descriptions of how to configure the web designer, create your first processes in it, add many types of specialized tasks, export processes to an Eclipse workspace, and customize the runtime.

It is particularly thorough in the steps needed to create your own customized implementations of certain components of the jBPM runtime, including the Human Task Server communication components, jBPM timer implementations, and history logs.

One of the things that makes this book an interesting read is that it goes into enough detail to help you create your own components inside the jBPM runtime, while at the same time remaining a very detailed guide that even coding beginners would be able to follow with ease.

Also, it is worth remarking that this book has followed a very interesting path since its birth. Being a Kickstarter-funded endeavour, the author put remarkable effort into making this book not only by writing it, but also by gathering funds for it and coordinating all the collaborations needed to make the book possible. I think this deserves an even greater level of congratulations. It is great to see people all around the world pushing open source projects like jBPM forward through their own effort alone.

If you wish to keep track of this book’s publishing, you can follow it on Facebook.

Cheers!

Predictive Analytics + Drools

I’ve been looking into Decision Management over the last few days. It is an excellent topic I’ve been missing out on for quite some time. Briefly, what I understand of it so far is that it is a set of methodologies that provide a closed cycle for knowledge management systems, from knowledge discovery and formalization, to exposing the knowledge through a runtime, to feeding results back into knowledge discovery.

This last part, the feedback, is a component I’ve always wondered whether it was possible to build as part of the tooling that Drools provides. The idea behind it is having an environment where you can play with data from your production systems, finding all sorts of combinations that you weren’t paying attention to before, either because you planned them for later or because you didn’t think that condition would appear in your environment. This is not always something you want to do lightly, especially when you’re handling private or sensitive information in your working memory, but after a certain amount of filtering, it is a good place to start investigating how your environment is behaving in production.

You might want to do this in a separate environment, however, because of two things:

  • Performance issues: If you’re going to perform queries, groupings, or any heavy search-engine-related tasks, you don’t want them to slow down your environment. For Drools, this is a very important topic, because many people use the rule engine precisely because of its high performance, and you don’t want it to decrease for any reason.
  • Simulation generation: Another heavy task. Once you find the patterns that would identify your new cases, or improve your existing ones, you might want to make changes to your knowledge definitions to see how they behave with the existing data. If they work well, you will want to apply the new knowledge definitions to the real production system, but not before.

The branch that targets this analysis is called predictive analytics, because it goes beyond real-time analysis of data: it also focuses on discovering trends of change in your production environments in order to suggest new ways in which production data might enter your environment in the coming minutes, hours, or months, depending on how far back the production data involved in the discovery process goes.

All these analyses are usually possible because they are meant to be conducted over Big Data. This is where Drools distances itself from Decision Management software. Drools production data (the working memory) lives in memory. It can be persisted, but even then it is just a serialized blob of information used to restore the session elsewhere. So, even if we built these analytical tools for Drools, they would have to work in memory.

This got me thinking about whether this sort of analysis could be done on top of a Drools rule engine. We certainly do have some tools in DRL that provide us with analysis capabilities.

Analytics with rules and Queries

DRL queries can give us insight into any specific working memory using the same search patterns that rules use. The one thing they wouldn’t give us easily is grouping, but rules and specific facts that carry grouping information could handle this quite easily:

rule "Group by init"
when
then
insert(new GroupBy());
end

rule "Group by example"
when
p: Person(age > 16)
gb: GroupBy()
then
gb.sum("personAge", p.getAge());
end

Also, once a query is constructed, group-by functionality could be implemented on the Java side (after all, everything is running in memory). But let’s postpone the detailed analysis of that by stating that, for the moment, if we wanted to analyze a working memory to find uncovered cases, we would most likely be able to. With that assumption in place, let’s try to see how we could implement an in-memory analytical tool for Drools.
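
As a rough sketch of what that “group by on the Java side” could look like, assume a DRL query named "people over age" that binds a $p variable to each matching Person (the query and the Person class with its getCity() accessor are invented for this example); the query API calls themselves are standard Drools.

import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;
import org.kie.api.runtime.KieSession;
import org.kie.api.runtime.rule.QueryResults;
import org.kie.api.runtime.rule.QueryResultsRow;

public class QueryGrouping {

    // Assumed to exist in the knowledge base:
    //   query "people over age"
    //       $p : Person( age > 16 )
    //   end
    public static Map<String, Long> countByCity(KieSession kSession) {
        QueryResults results = kSession.getQueryResults("people over age");
        List<Person> matches = new ArrayList<>();
        for (QueryResultsRow row : results) {
            matches.add((Person) row.get("$p"));
        }
        // The grouping itself happens in plain Java, over the in-memory results.
        return matches.stream()
                      .collect(Collectors.groupingBy(Person::getCity, Collectors.counting()));
    }
}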

Given that, let’s look at three different scenarios to start building some tooling for this. We will use a few diagrams with the same color coding: white is for already existing systems, yellow for components that are easy to build, and red for components that would be hard to implement:

Case 1: Your own 100% developed environment

Let’s say that, for the simulation environment where we will run our analytics, we were going to build an entirely new system from scratch. Some things might be easy to create, some not so much. Usually when we do this, it is because we already have an existing application dedicated to this purpose and want to use it for analyzing our Drools environment, so we will assume the environment already exists. But what do we have to build for it?

Without a KMS

As you can see, the first thing we will need is a way to send new information to Environment B. Session persistence could take care of this, but when we run complex working memories we usually don’t persist them, in order to gain maximum performance. In this scenario, we would use a single component of persistence: session serialization. Using any pluggable communication method, we could create a communication channel between the two environments to share the same session, even if that session is not persistent.
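
As a minimal sketch of that “session serialization” idea, the marshalling API that ships with Drools can write a (possibly non-persistent) session to any stream; the file paths here are just a stand-in for whatever transport you put between Environment A and Environment B.

import java.io.FileInputStream;
import java.io.FileOutputStream;
import org.kie.api.KieBase;
import org.kie.api.KieServices;
import org.kie.api.marshalling.Marshaller;
import org.kie.api.runtime.KieSession;

public class SessionCopy {

    // Environment A: serialize the running session (both sides need the same KieBase).
    public static void export(KieBase kBase, KieSession kSession, String path) throws Exception {
        Marshaller marshaller = KieServices.Factory.get().getMarshallers().newMarshaller(kBase);
        try (FileOutputStream out = new FileOutputStream(path)) {
            marshaller.marshall(out, kSession);
        }
    }

    // Environment B: rebuild an equivalent session to run the analysis against.
    public static KieSession restore(KieBase kBase, String path) throws Exception {
        Marshaller marshaller = KieServices.Factory.get().getMarshallers().newMarshaller(kBase);
        try (FileInputStream in = new FileInputStream(path)) {
            return marshaller.unmarshall(in);
        }
    }
}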

The one component that would be hard to implement, however, would be a runtime and UI where we can construct queries or rules to run simulations with the production data. This would involve creating query editors, rule editors, runtime components to perform those searches in the copied working memory, and UIs to show the results to the user. These components would be quite hard to build, not because of any intrinsic complexity, but mostly because they would have to be maintained by third parties (A.K.A. YOU).

Case 2: Using the KIE Workbench functionality embedded in our application

Fortunately, there is an alternative to writing your own editors that would facilitate development a lot. That is, using the Guvnor rule editors to construct a query editor you could use from your application. That leaves Environment B to worry only about having an execution environment for running simulation scenarios.

With a KMS

It also facilitates deployment, because if you use Guvnor’s internal build-and-deploy functionality, all new knowledge definitions can be made available to your production environment using nothing more than the KieScanner, provided by the kie-ci dependency in Drools 6.
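
For reference, a minimal sketch of how the KieScanner side of that could look in the production application (the GAV coordinates are made up, and kie-ci has to be on the classpath):

import org.kie.api.KieServices;
import org.kie.api.builder.KieScanner;
import org.kie.api.builder.ReleaseId;
import org.kie.api.runtime.KieContainer;

public class ScannerBootstrap {
    public static void main(String[] args) {
        KieServices ks = KieServices.Factory.get();

        // Hypothetical kjar published through the workbench's build & deploy.
        ReleaseId releaseId = ks.newReleaseId("com.example", "analytics-kjar", "LATEST");
        KieContainer kContainer = ks.newKieContainer(releaseId);

        // Poll the repository periodically and pick up new rule/process versions
        // without restarting the production application.
        KieScanner scanner = ks.newKieScanner(kContainer);
        scanner.start(60000L);
    }
}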

The one problem is that you would have to embed the existing editors inside your own applications. It is something most Drools tooling users would reasonably want to do, but for the moment it is rather complex to integrate. Perhaps there is a way to have everything Environment B has in a single place?

Case 3: Extending the KIE Workbench functionality for analytics

In this scenario, we would just extend the existing functionality of the existing workbench. A query editor could easily be built using the guided rule editor. A query executor and analyzer could be created from injectable components and UIs. Event sending could be used to trigger a simple yet powerful session replicator from a different environment. This, I think, would be the best way to start building analytical tools on top of Drools right now:
Everything inside Kie WB

Conclusion

Of course, these are just theoretical components for the moment, but they would be entirely possible to implement. They would provide a huge added value to the discovery stages of knowledge development for the Drools tooling. Development over the next few months could prove me wrong, but I hope it will not, and I also hope you found this analysis of Drools’ analytic possibilities informative.

Cheers!

Book review: Drools JBoss Rules 5.X Developer’s Guide


I’ve been reading the new Drools JBoss Rules 5.X Developer’s Guide book, and I’m really pleased with it. It’s a very thorough compendium of Drools and jBPM introductory content, going all the way up to high-performance tuning and application integration.

I especially liked the Decision Tables explanation, full of examples showing how to configure them. It’s a pity they had to cut the rule templates explanation, but since it is no longer part of the supported APIs in Drools 6, I would say it was for the best.

For people wishing to get a preview of the book’s contents, chapter 6, which discusses stateful knowledge sessions, can be downloaded here.

When I reached the jBPM chapter, I was quite surprised. Explaining all of Drools’ concepts can be quite cumbersome, enough to fill a full book, and the same is true of jBPM’s concepts alone. I didn’t expect to see so much explanation of jBPM concepts in a Drools book, so it was a very pleasant plus! It helps you a lot in configuring process definitions using the Eclipse editors.

It also teaches a lot about how to integrate the runtime into a full web application. This sort of example has the great advantage of letting the reader see the components running in the same configurations they can later use in production environments. Most books, blogs, trainings and examples only provide JUnit tests that, even if they cover the full contents of the framework, don’t show it interacting with a full architecture. Having such a big portion of the book dedicated to this topic was both refreshing and satisfying, as it was very well explained, especially the Spring integration examples.

The integration chapters were very thorough. One thing, though: I recommend reading chapter 11 before chapter 10, because the integration tools from the eleventh chapter are used to build the webapp in the tenth.

The last piece I found to be actually brilliant was the “Learning about Performance” chapter. It is one of the clearest descriptions I’ve seen of the Rete algorithm and how to make it work better for your particular domain. That chapter alone is something I would recommend even for people who have already learned the Drools and jBPM concepts.

In conclusion, I can see that this book tried to cover everything in a single volume and did the closest thing possible: covering most of the topics in detail and giving an interesting introduction to the rest of the concepts. Drools 6 related books are around the corner, but this title provides a very good introduction for people wishing to get started with Drools, and I hope you get a chance to read it, whether you’re just starting with Drools or you’ve been working with it for a while.
Cheers!

Drools and jBPM training London: Next week!

I’m really excited about next week’s Drools and jBPM training! We’ll be traveling today, and before I go, I wanted to leave you this updated list of the contents of the course, with an additional UberFire part on the first day.

Places are limited but there’s still room, so subscribe while you can!

Also, remember that on the 23rd and 24th at 3:00 PM we will have free Drools and jBPM workshops for anyone who wishes to ask anything of KIE core developers Mauricio Salatino and Michael Anstis!

I have to go now. Cheers!

Drools and jBPM Training London: exercise teaser

Greetings everyone! In this post I’ll be showing one of the exercises we will be playing with in the next Drools & jBPM Training in London, October 21-25. There’s still time to register, so go ahead!

This exercise shows a process interaction for something we developers and analysts know pretty well: managing requirements in a sprint. It’s something that we do every day, so we don’t have to waste much time explaining the domain, and we can get to the process definitions, and to how to do each task, really fast.

It’s also a very good example to run through all the things related to a process execution:

  • Human tasks: Writing code, performing QA analysis, reporting bugs and fixing them.
  • Automated tasks: Jenkins interactions in a continuous integration environment, automatic deploys and tests, and email notifications all have a use in this small case
  • Process interactions: Each requirement in a sprint is a process by itself, and the sprint runs as a process too.
  • Rules execution tasks: We can use them to validate requirements, define initial priorities, and probably a lot more.

The processes look something like this for the requirements:

You can see that we have the whole requirement life cycle defined in this process definition: when developers implement it, when it has to be tested, what to do if bugs are found… and finally, the process instance is completed when no more bugs are found in the requirement implementation.

This one is for the sprint:

It’s a bit more cryptic, but the objective is simple. It starts by distributing all the requirements’ priorities using rules, then starts each requirement’s process using a script task. The process will then finish when a signal is sent to the instance indicating that either all requirements are completed or the sprint was manually closed.
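
A rough sketch of how that interaction might look through the engine’s API, assuming a process id of "sprint" and a signal named "sprint-closed" (both invented for this example):

import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;
import org.kie.api.runtime.KieSession;
import org.kie.api.runtime.process.ProcessInstance;

public class SprintDriver {

    public static void run(KieSession kSession) {
        // Start the sprint process, handing it the requirements to distribute.
        Map<String, Object> params = new HashMap<>();
        params.put("requirements", Arrays.asList("REQ-1", "REQ-2"));
        ProcessInstance sprint = kSession.startProcess("sprint", params);

        // Later on, either the last requirement's process or a manual action
        // signals the sprint instance so it can reach its end event.
        kSession.signalEvent("sprint-closed", null, sprint.getId());
    }
}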

We will show you a small test case where you can simulate all the steps of these processes, learn how to make them intercommunicate using different methods, and how to run them using custom handlers as well as asynchronous executors, and we will have a lot of fun learning how to add new features to the processes and the runtime. You can download the example from here to play with it in the meantime!

Cheers,