Data Preprocessing

Discussion 1:

In today’s world, data is generated from many sources and in many formats. As internet use grows rapidly across devices such as sensors, CCTV cameras, laptops, workstations, tablets, and iPads, much of the data available online is unstructured and arrives as text files, PDF files, images, videos, tweets, and other formats (García, Luengo & Herrera, 2015). Collected data is often unclean, incomplete, denormalized, and unprocessed. Using raw data directly can produce misleading results, so it is of little use for analytics.

Before data can be used for analytics, its quality must be assessed on three factors: accuracy, completeness, and consistency. First, the data needs to be accurate; inaccuracy arises when people enter random or erroneous values, and incorrect or duplicated records carry those errors into processing. Second, the data needs to be complete; incompleteness is caused by unavailable data and by the deletion of records. Third, the data needs to be consistent; keeping the data consistent is a key requirement for producing reliable analytical results.

Processed data supports analysis by making it possible to generate the graphs and tables used in decision making. Preprocessing comprises four stages: data cleaning, data integration, data reduction, and data transformation (Kamiran & Calders, 2012). The first stage, data cleaning, involves identifying missing values and eliminating noisy data; techniques for removing noise include binning, regression, and outlier analysis. The second stage is data integration: because data is collected from various sources, it must be integrated so that related or correlated data can be identified.
The third stage is data reduction: various techniques eliminate duplicate data and reduce large volumes of data. The final stage is data transformation, which puts the data into a form suitable for algorithms and analytic techniques.

References

García, S., Luengo, J., & Herrera, F. (2015). Data preprocessing in data mining (pp. 195-243). Cham, Switzerland: Springer International Publishing.

Kamiran, F., & Calders, T. (2012). Data preprocessing techniques for classification without discrimination. Knowledge and Information Systems, 33(1), 1-33.

Discussion 2:

Why are the original/raw data not readily usable by analytics tasks?

Raw data is usually dirty, inaccurate, and misaligned, so it cannot be used in its raw format (Sharda et al., 2020). It can also be unstructured and overly complicated. Data preprocessing is therefore needed to transform raw data into refined data before analytics is performed (Sharda et al., 2020).

What are the main data preprocessing steps?

The process starts with data consolidation, which collects, selects, and integrates data. It may involve filtering out unnecessary data before the data is used. The next step is data cleaning, which removes errors from the data (Sharda et al., 2020); in this step, missing values are usually imputed and duplicate records are eliminated. The third step, data transformation, involves normalization, where values are rescaled to a common range between a minimum and a maximum. It also involves discretization, the categorization of data into different classes (Alasadi & Bhaya, 2017), and the construction of new attributes. The last step in data preprocessing is data reduction, which reduces dimension, reduces volume, and balances the data (Alasadi & Bhaya, 2017).
The last step keeps the volume of data manageable, since too much data can be challenging to handle.

List and explain their importance in analytics.

Data consolidation, the first step, is essential because it covers data collection, selection, and integration. In this step, unnecessary data is usually eliminated so that only appropriate data remains (Losarwar & Joshi, 2012). In data cleaning, data scrubbing is vital because it removes data that contains errors. The step also reduces duplication, which removes data redundancy. Data transformation enables easier categorization of data (Alasadi & Bhaya, 2017). This is important because data organized into categories can be used efficiently, which would be impossible while the data is unstructured (Sharda et al., 2020). Data reduction enables data balancing, so that no part of the data is over-sampled or under-sampled. Preprocessing is therefore a necessary part of data analytics.
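The cleaning, transformation, and reduction steps described above can be illustrated with a minimal sketch. It is not taken from the cited sources: the toy records, the column names (`age`, `income`, `age_group`), the bin boundaries, and all helper function names are assumptions made for illustration, and mean imputation is only one of several possible imputation choices. Consolidation and data balancing are not shown.

```python
from bisect import bisect_left
from statistics import mean

# Hypothetical toy records; None marks a missing value.
raw_rows = [
    {"id": 1, "age": 25, "income": 30000},
    {"id": 2, "age": None, "income": 52000},
    {"id": 2, "age": None, "income": 52000},  # exact duplicate of the previous record
    {"id": 3, "age": 41, "income": 88000},
    {"id": 4, "age": 33, "income": None},
]


def drop_duplicates(rows):
    """Cleaning: keep the first occurrence of each identical record."""
    seen = set()
    unique = []
    for row in rows:
        key = tuple(sorted(row.items()))  # hashable fingerprint of the record
        if key not in seen:
            seen.add(key)
            unique.append(row)
    return unique


def impute_mean(rows, column):
    """Cleaning: replace missing values in `column` with the mean of the observed values."""
    observed = [r[column] for r in rows if r[column] is not None]
    fill = mean(observed)
    return [{**r, column: fill if r[column] is None else r[column]} for r in rows]


def discretize(rows, column, upper_bounds, labels, new_column):
    """Transformation: label each value with the first bin whose upper bound it does not exceed."""
    return [
        {**r, new_column: labels[bisect_left(upper_bounds, r[column])]}
        for r in rows
    ]


def min_max_scale(rows, column):
    """Transformation: rescale `column` linearly to the range [0, 1]."""
    values = [r[column] for r in rows]
    lo, hi = min(values), max(values)
    span = hi - lo
    return [{**r, column: (r[column] - lo) / span if span else 0.0} for r in rows]


def select_columns(rows, columns):
    """Reduction: keep only the attributes needed downstream."""
    return [{c: r[c] for c in columns} for r in rows]


rows = drop_duplicates(raw_rows)
rows = impute_mean(rows, "age")
rows = impute_mean(rows, "income")
rows = discretize(rows, "age", [30, 40], ["young", "middle", "senior"], "age_group")
rows = min_max_scale(rows, "income")
prepared = select_columns(rows, ["age_group", "income"])

for row in prepared:
    print(row)
```

In this sketch the age column is discretized before any scaling so that the bin boundaries stay in the original units, and each helper returns new records rather than modifying its input, which keeps the steps easy to reorder or test in isolation.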
