Mine your organisation’s data using our data mining workbench tool, Authority Miner®, to provide consistent, reliable and timely operational intelligence. You can combine disparate data sets within a single work process from a range of sources including Oracle, SQL Server, Microsoft Access, csv, txt, pdf and Microsoft Word documents.
There are many definitions of data mining but the one that we prefer is:
To automatically interrogate one or more data sets with a view to providing actionable information that will save time, prevent and reduce crime, deter offending and enhance operational effectiveness in today’s Policing.
Data mining, as with all analytical processes, requires a structured methodology and the one we recommend is the Cross Industry Standard Process for Data Mining (CRISP-DM).
The process, as illustrated above, is cyclic and has six sections:
- Business understanding – In this section the business objectives and the data mining goal are determined together with the success criteria. A situational assessment is undertaken to cover the resource requirements, constraints, risks and contingencies and cost benefits. A project plan can be produced in this stage if required.
- Data understanding – The initial data set(s) are collected, described and documented. An initial exploration of the data is carried out and its quality is assessed.
- Data preparation – In line with the 80/20 rule, around 80% of the project’s time is consumed in this section. The data is cleansed, merged and reformatted, and new attributes are derived and refined ready for the next stage.
- Modelling – A number of modelling techniques are available, many of which can be used in combination. Chapter 4 of the book Data Mining: Practical Machine Learning Tools and Techniques (Witten I H, Frank E, Hall M A (2011), Data Mining: Practical Machine Learning Tools and Techniques, Morgan Kaufmann, MA, USA) and the paper Top 10 algorithms in data mining (X. Wu, V. Kumar, J. R. Quinlan, J. Ghosh, Q. Yang, H. Motoda, G. J. McLachlan, A. Ng, B. Liu, P. S. Yu, Z.-H. Zhou, M. Steinbach, D. J. Hand, and D. Steinberg. Top 10 algorithms in data mining. Knowledge and Information Systems, 14(1):1–37, 2008) both provide a very good overview of a large number of modelling techniques. To model effectively, the data needs to be partitioned into training set(s), testing set(s) and holdback set(s), and the models assessed using cross-validation techniques. Modelling in this section can also take the form of process creation: an Authority Miner® process imports a set of data, then manipulates and transforms it through a series of nodes before providing an optimal operational solution.
- Evaluation – This section builds on the model assessment to evaluate the results of the data mining process against the success criteria set in the first section, Business understanding. The results of this analysis provide a list of possible actions and decisions. The evaluation takes two forms:
  - Evaluate the model against current data to ensure that it performs as expected.
  - Evaluate the model within the terms of the business. For example:
    - Are crimes reduced?
    - Are the correct offenders displayed?
    - Are the criminal networks relevant?
- Deployment – A deployment plan, if required, will be produced to bring this new data mining process into the organisation. Once properly deployed, the whole CRISP-DM process may need to be re-run to refine the model and increase productivity.
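The partitioning and cross-validation described under Modelling can be sketched in plain Python. This is only a minimal illustration under stated assumptions, not how Authority Miner® works internally: the function names, the 70/20/10 split, the five folds, the synthetic records and the majority-class baseline are all hypothetical choices made for the example.

```python
import random


def partition(records, test_frac=0.2, holdback_frac=0.1, seed=0):
    """Shuffle records and split them into training, testing and holdback sets."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)  # fixed seed keeps the split repeatable
    n_holdback = int(len(shuffled) * holdback_frac)
    n_test = int(len(shuffled) * test_frac)
    holdback = shuffled[:n_holdback]
    testing = shuffled[n_holdback:n_holdback + n_test]
    training = shuffled[n_holdback + n_test:]
    return training, testing, holdback


def k_folds(rows, k=5):
    """Yield (train, validate) pairs so that each row is validated exactly once."""
    folds = [rows[i::k] for i in range(k)]
    for i in range(k):
        train = [r for j, fold in enumerate(folds) if j != i for r in fold]
        yield train, folds[i]


def majority_label(rows):
    """Baseline 'model' (assumption): always predict the most common label."""
    labels = [label for _, label in rows]
    return max(sorted(set(labels)), key=labels.count)


def cross_validate(rows, k=5):
    """Return the mean accuracy of the baseline across k folds."""
    scores = []
    for train, validate in k_folds(rows, k):
        predicted = majority_label(train)
        hits = sum(1 for _, label in validate if label == predicted)
        scores.append(hits / len(validate))
    return sum(scores) / len(scores)


# Synthetic (record_id, label) pairs, purely for illustration.
records = [(i, "A" if i % 4 == 0 else "B") for i in range(100)]
training, testing, holdback = partition(records)
score = cross_validate(training, k=5)
print(len(training), len(testing), len(holdback), round(score, 3))
```

In this sketch the holdback set is never touched during cross-validation, so it stays available for a final check of the chosen model, while the testing set supports comparison between candidate models.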
For further information regarding CRISP-DM you can visit https://en.wikipedia.org/wiki/Cross_Industry_Standard_Process_for_Data_Mining.