Data Mining Implementation Process
Cross-Industry Standard Process for Data Mining (CRISP-DM)
- CRISP-DM consists of six phases arranged in a cycle, as shown in the figure:
Cross-Industry Standard Process for Data Mining
Business Understanding
- Understands the project goals and requirements from a business point of view.
- Converts this knowledge into a data mining problem definition.
- A preliminary plan is then designed to achieve the objectives.
- Determine Data Mining Goals
- Produce a Project Plan
- Determine Business Objectives
- Assess Situation
Determine Data Mining Goals
- The goal of predictive data mining is to produce a model that can be used to perform tasks such as classification, prediction, or estimation, while the goal of descriptive data mining is to gain an understanding of the analysed system by uncovering patterns and relationships in large data sets.
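To make the distinction concrete, here is a minimal Python sketch. The toy records, the `predict_label` function (a one-nearest-neighbour rule), and the `describe_pairs` function (a simple co-occurrence count) are illustrative assumptions, not techniques prescribed by CRISP-DM.

```python
from collections import Counter
from itertools import combinations

# Hypothetical labelled records: (age, income in thousands) -> class label.
training = [
    ((25, 30), "low"),
    ((32, 45), "low"),
    ((47, 90), "high"),
    ((52, 110), "high"),
]

def predict_label(point, examples):
    """Predictive task: return the label of the closest training example."""
    def squared_distance(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    _, label = min(examples, key=lambda ex: squared_distance(ex[0], point))
    return label

# Hypothetical shopping baskets for the descriptive task.
baskets = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]

def describe_pairs(transactions):
    """Descriptive task: count how often each pair of items occurs together."""
    counts = Counter()
    for basket in transactions:
        counts.update(combinations(sorted(basket), 2))
    return counts

print(predict_label((50, 100), training))
print(describe_pairs(baskets))
```

The first function answers a question about a new observation, while the second only summarises relationships already present in the data.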
Produce a Project Plan:
- A data mining project is divided into two stages: data preparation (data preprocessing) and data mining.
- The data preparation stage comprises Data Cleaning, Data Integration, Data Selection, and Data Transformation.
- The data mining stage comprises Data Mining, Pattern Evaluation, and Data Representation.
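The two stages can be pictured as a chain of small functions. The sketch below is only an illustration: every function name, the sample rows, and the "total spend per customer" stand-in for a mining algorithm are assumptions made for this example.

```python
# Stage 1: data preparation.
def clean(rows):
    """Data Cleaning: drop rows whose amount is missing."""
    return [r for r in rows if r["amount"] is not None]

def integrate(*sources):
    """Data Integration: combine rows from several sources."""
    return [r for source in sources for r in source]

def select(rows):
    """Data Selection: keep only the attributes needed for mining."""
    return [{"customer": r["customer"], "amount": r["amount"]} for r in rows]

def transform(rows):
    """Data Transformation: convert amounts from text to numbers."""
    return [{**r, "amount": float(r["amount"])} for r in rows]

# Stage 2: data mining.
def mine(rows):
    """Data Mining: total spend per customer (stand-in for a real algorithm)."""
    totals = {}
    for r in rows:
        totals[r["customer"]] = totals.get(r["customer"], 0.0) + r["amount"]
    return totals

def evaluate(patterns, threshold):
    """Pattern Evaluation: keep only the patterns judged interesting."""
    return {k: v for k, v in patterns.items() if v >= threshold}

def represent(patterns):
    """Data Representation: present the surviving patterns as text."""
    return [f"{k}: {v:.2f}" for k, v in sorted(patterns.items())]

# Hypothetical input from two sources.
source_a = [
    {"customer": "ann", "amount": "10.5", "note": "x"},
    {"customer": "bob", "amount": None, "note": "y"},
]
source_b = [
    {"customer": "ann", "amount": "4.5", "note": "z"},
    {"customer": "cy", "amount": "2", "note": ""},
]

prepared = transform(select(integrate(clean(source_a), clean(source_b))))
result = represent(evaluate(mine(prepared), threshold=5.0))
print(result)
```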
Determine Business Objectives
- Data mining is the process of finding patterns, differences, and relationships in large datasets that can be used to make predictions about future trends.
- The main purpose of data mining is to extract valuable information from existing data.
Assess Situation
- Requires a more detailed analysis of the facts about all the resources, constraints, assumptions, and other factors that ought to be considered.
Data Understanding
- Data understanding is the knowledge you have about the data, the needs that the data will satisfy, its content, and its location.
- To be clear, it is far more than the current location and a definition of what a data element means within an application or database.
- Explore Data
- Collect Initial Data
- Verify Data Quality
- Describe Data
Explore Data
- Helps refine the data mining objectives.
- Contributes to or refines the data description and quality reports.
- Feeds into the transformation and other data preparation steps needed for further analysis.
- Includes the results of simple aggregations.
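A simple aggregation of the kind mentioned above might look like the following sketch. The `aggregate` helper and the sales records are hypothetical and use only the Python standard library.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical records to explore.
records = [
    {"region": "north", "sales": 120.0},
    {"region": "south", "sales": 80.0},
    {"region": "north", "sales": 100.0},
    {"region": "south", "sales": 40.0},
]

def aggregate(rows, key, value):
    """Group rows by `key` and report the count and mean of `value`."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[key]].append(row[value])
    return {g: {"count": len(v), "mean": mean(v)} for g, v in groups.items()}

summary = aggregate(records, "region", "sales")
print(summary)
```

A gap between groups, such as the one between the two regions here, is the sort of observation that can feed back into the data mining objectives.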
Collect Initial Data
- Leads to the initial data preparation steps.
- Includes data loading, if needed for data understanding.
- Acquires the data listed in the project resources.
- If several data sources are acquired, integration becomes an additional issue.
Verify Data Quality
- Verifies the quality of the data by addressing questions such as whether the data is complete and whether it contains errors or missing values.
Describe Data
- Describes the characteristics of the data obtained and reports on the results.
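As a rough illustration of describing data and checking its quality, the sketch below builds a small per-column report. The `describe` function, the choice to treat `None` and empty strings as missing, and the sample rows are assumptions made for this example.

```python
from statistics import mean

# Hypothetical rows with a missing age, an empty city, and a repeated id.
rows = [
    {"id": 1, "age": 34, "city": "Oslo"},
    {"id": 2, "age": None, "city": "Bergen"},
    {"id": 2, "age": 51, "city": ""},
]

def describe(rows):
    """Report row count plus missing, distinct, and numeric stats per column."""
    report = {"rows": len(rows), "columns": {}}
    for col in rows[0]:
        values = [r[col] for r in rows]
        # Assumption: None and the empty string both count as missing.
        present = [v for v in values if v is not None and v != ""]
        info = {
            "missing": len(values) - len(present),
            "distinct": len(set(present)),
        }
        numeric = [v for v in present if isinstance(v, (int, float))]
        if numeric and len(numeric) == len(present):
            info.update(min=min(numeric), max=max(numeric), mean=mean(numeric))
        report["columns"][col] = info
    return report

report = describe(rows)
print(report)
```

A `distinct` count lower than the row count for a column expected to be unique, such as `id` here, hints at duplicate records worth investigating.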
Data Preparation
- It usually takes more time than the other phases.
- It covers all operations needed to build the final data set from the original raw data.
- Data preparation tasks are likely to be performed several times.
- Select data
- Clean data
- Construct data
- Integrate data
- Format data
Select Data
- Decides which data will be used for the analysis. This covers the selection of attributes (columns) as well as the selection of records (rows) in a table.
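Selecting both attributes and records can be sketched in a few lines. The `select_data` helper and the customer table below are hypothetical.

```python
# Hypothetical customer table.
customers = [
    {"id": 1, "age": 34, "country": "NO", "newsletter": True},
    {"id": 2, "age": 51, "country": "SE", "newsletter": False},
    {"id": 3, "age": 27, "country": "NO", "newsletter": False},
]

def select_data(rows, columns, keep_row):
    """Keep the rows accepted by `keep_row`, restricted to `columns`."""
    return [{c: r[c] for c in columns} for r in rows if keep_row(r)]

selected = select_data(customers, ["id", "age"], lambda r: r["country"] == "NO")
print(selected)
```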
Clean Data
- Raises data quality by selecting clean subsets of the data, inserting appropriate defaults, or using more ambitious methods such as estimating missing values by modeling.
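The two cleaning strategies can be sketched as follows. Both helper names are hypothetical, and filling with the column mean is used here only as the simplest possible stand-in for "estimating by modeling"; a real project might fit a regression or similar model instead.

```python
from statistics import mean

# Hypothetical rows with missing values.
rows = [
    {"age": 30, "segment": "a"},
    {"age": None, "segment": None},
    {"age": 50, "segment": "b"},
]

def fill_default(rows, column, default):
    """Insert a fixed default wherever the column is missing."""
    return [
        {**r, column: default if r[column] is None else r[column]}
        for r in rows
    ]

def fill_mean(rows, column):
    """Estimate missing values with the mean of the observed values."""
    observed = [r[column] for r in rows if r[column] is not None]
    estimate = mean(observed)
    return [
        {**r, column: estimate if r[column] is None else r[column]}
        for r in rows
    ]

cleaned = fill_mean(fill_default(rows, "segment", "unknown"), "age")
print(cleaned)
```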
Construct Data
- Generating Derived Attributes
- Constructive Data Preparation Operations
- Transformed Values of Existing Attributes
- Complete New Records
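Two of these construction operations, a derived attribute and a transformed value, are sketched below. The order records, the `total` attribute, and the choice of min-max scaling are assumptions made for illustration.

```python
# Hypothetical order records.
orders = [
    {"price": 20.0, "quantity": 3},
    {"price": 5.0, "quantity": 10},
]

def add_derived_total(rows):
    """Derived attribute: total = price * quantity."""
    return [{**r, "total": r["price"] * r["quantity"]} for r in rows]

def min_max_scale(rows, column):
    """Transformed value: rescale an existing attribute to the range [0, 1]."""
    values = [r[column] for r in rows]
    low, high = min(values), max(values)
    span = (high - low) or 1.0  # avoid dividing by zero when all values match
    return [{**r, column + "_scaled": (r[column] - low) / span} for r in rows]

constructed = min_max_scale(add_derived_total(orders), "price")
print(constructed)
```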
Integrate Data
- Methods by which data from multiple tables or records is combined to create new records or values.
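Both outcomes, new records and new values, can be sketched with plain dictionaries. The table contents and the helper names `inner_join` and `add_order_total` are hypothetical.

```python
# Hypothetical tables sharing a customer_id key.
customers = [
    {"customer_id": 1, "name": "Ann"},
    {"customer_id": 2, "name": "Bob"},
]
orders = [
    {"customer_id": 1, "amount": 30.0},
    {"customer_id": 1, "amount": 20.0},
    {"customer_id": 3, "amount": 5.0},
]

def inner_join(left, right, key):
    """New records: one merged row per matching pair of left and right rows."""
    index = {}
    for row in right:
        index.setdefault(row[key], []).append(row)
    return [{**l, **r} for l in left for r in index.get(l[key], [])]

def add_order_total(customers, orders, key="customer_id"):
    """New values: attach each customer's summed order amount."""
    totals = {}
    for o in orders:
        totals[o[key]] = totals.get(o[key], 0.0) + o["amount"]
    return [{**c, "order_total": totals.get(c[key], 0.0)} for c in customers]

joined = inner_join(customers, orders, "customer_id")
enriched = add_order_total(customers, orders)
print(joined)
print(enriched)
```

Note that the order for `customer_id` 3 has no matching customer and is dropped by the join, which is exactly the kind of issue that arises when several sources are integrated.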
Format Data
- Formatting mainly refers to syntactic changes made to the data that do not alter its meaning but may be required by the modeling tool.
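One example of such a purely syntactic change is writing the records as CSV with the target attribute in the last column. The sketch below assumes a hypothetical modeling tool with that requirement; `to_csv` and the sample rows are illustrative.

```python
import csv
import io

# Hypothetical prepared rows; "churn" is the target attribute.
rows = [
    {"churn": "yes", "age": 34, "plan": "basic"},
    {"churn": "no", "age": 51, "plan": "pro"},
]

def to_csv(rows, feature_columns, target_column):
    """Write rows as CSV text with the target attribute as the last column."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(feature_columns + [target_column])
    for r in rows:
        writer.writerow([r[c] for c in feature_columns] + [r[target_column]])
    return buffer.getvalue()

formatted = to_csv(rows, ["age", "plan"], "churn")
print(formatted)
```

The values themselves are unchanged; only their order and encoding differ.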
Modeling
- Data modeling refers to a group of processes in which multiple sets of data are combined and analyzed to uncover relationships or patterns.
- In data mining, you look for valuable and relevant data to answer the marketing question.
Evaluation
- Evaluation measures for classification problems: in data mining, classification is the problem of predicting which category or class a new observation belongs to.
- The derived model (classifier) is based on the analysis of training data in which each record is given a class label.
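Common evaluation measures for a two-class problem can be computed from the counts of correct and incorrect predictions. The sketch below is a minimal illustration; the labels, the sample predictions, and the `classification_metrics` helper are hypothetical.

```python
def classification_metrics(actual, predicted, positive):
    """Compute accuracy, precision, recall, and F1 for one positive class."""
    pairs = list(zip(actual, predicted))
    tp = sum(a == positive and p == positive for a, p in pairs)
    fp = sum(a != positive and p == positive for a, p in pairs)
    fn = sum(a == positive and p != positive for a, p in pairs)
    tn = sum(a != positive and p != positive for a, p in pairs)

    accuracy = (tp + tn) / len(pairs) if pairs else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    return {
        "tp": tp, "fp": fp, "fn": fn, "tn": tn,
        "accuracy": accuracy, "precision": precision,
        "recall": recall, "f1": f1,
    }

# Hypothetical class labels from held-out data and a classifier's predictions.
actual = ["spam", "spam", "ham", "ham", "spam", "ham"]
predicted = ["spam", "ham", "ham", "spam", "spam", "ham"]

metrics = classification_metrics(actual, predicted, positive="spam")
print(metrics)
```

Accuracy alone can be misleading when one class is rare, which is why precision and recall are usually reported alongside it.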
Deployment
- Deployment in data mining refers to the application of a model for prediction on new data.
- The deployment phase can be as simple as generating a report or as complex as implementing a repeatable data mining process.
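At the simple end of that range, deployment could amount to loading a stored model, scoring new records, and summarising the result in a report. The sketch below assumes a hypothetical "model" consisting of a single threshold rule stored as JSON; all names and values are illustrative.

```python
import json

# Hypothetical stored model: one threshold rule found earlier in the project.
model_json = json.dumps({
    "feature": "monthly_spend",
    "threshold": 50.0,
    "above": "premium",
    "below": "standard",
})

def load_model(text):
    """Restore the stored model from its JSON form."""
    return json.loads(text)

def score(model, record):
    """Apply the model to one new record."""
    value = record[model["feature"]]
    return model["above"] if value >= model["threshold"] else model["below"]

def deployment_report(model, records):
    """Summarise how many new records fall into each predicted class."""
    counts = {}
    for record in records:
        label = score(model, record)
        counts[label] = counts.get(label, 0) + 1
    return counts

# New data that the model has not seen before.
new_records = [
    {"monthly_spend": 75.0},
    {"monthly_spend": 20.0},
    {"monthly_spend": 50.0},
]

model = load_model(model_json)
report = deployment_report(model, new_records)
print(report)
```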
- Plan deployment
- Plan monitoring and maintenance
- Produce final report
- Review project
Plan Deployment
- Deploys the data mining results into the business.
- Takes the evaluation results.
- Determines a strategy for deployment.
- Documents the procedure for later deployment.
Plan Monitoring and Maintenance
- It is important when the data mining results become part of the day-to-day business and its environment.
- It helps to avoid unnecessarily long periods of incorrect use of data mining results.
- It requires an in-depth analysis of the monitoring process.
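One possible monitoring check, sketched below, compares recent accuracy against the accuracy measured during evaluation and flags the model for review when the drop exceeds a tolerance. The `needs_review` function, the baseline figure, and the tolerance of 0.05 are assumptions made for illustration, not part of CRISP-DM.

```python
def needs_review(baseline_accuracy, recent_outcomes, tolerance=0.05):
    """Return True when recent accuracy falls too far below the baseline.

    `recent_outcomes` holds one boolean per recent prediction:
    True if the prediction turned out to be correct.
    """
    if not recent_outcomes:
        return False  # nothing to judge yet
    recent_accuracy = sum(recent_outcomes) / len(recent_outcomes)
    return baseline_accuracy - recent_accuracy > tolerance

# Hypothetical outcomes: 8 of the last 10 predictions were correct.
degraded = [True] * 8 + [False] * 2
stable = [True] * 9 + [False]

print(needs_review(0.9, degraded))
print(needs_review(0.9, stable))
```

Running a check like this on a schedule is one way to shorten the period during which stale results are used.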
Produce Final Report
- The final report can be drawn up by the project leader and the team.
- It may be only a summary of the project and its experiences.
- It may be a final and comprehensive presentation of the data mining results.
Review Project
- The project review assesses what went right and what went wrong, what was done well, and what needs to be improved.