The four V’s of big data are velocity, volume, variety, and veracity. These characteristics help establish an understanding of big data and of its most significant components. When building a cloud strategy that is effective and tailored to customer needs, developers often use the four V’s in their analysis to understand the capacity required and the critical variables that must be put into practice. Most people perceive the value of big data in terms of its volume, velocity, variety, and veracity. Analysis of the four V’s usually yields plenty of information, some of which is loosely defined. Ensuring the quality of data in transit therefore requires an accurate definition and analysis of the data. The exploration of big data is about establishing correlations between the things one needs to know and the real possibilities.
Velocity is the speed at which data is generated, created, or refreshed. Volume, one of the best-known characteristics of big data, is the amount of data involved. Variety refers to the range of data types and formats, from structured to unstructured. Veracity refers to the provenance and reliability of the data sources, the context, and the meaningfulness of the analysis based on the data.
One difference between the four V’s and the ten V’s is that the ten V’s include an aspect of security, vulnerability, which the four V’s do not consider in detail. However, all the V’s refer to characteristics of big data that have to be addressed in a data strategy. In the model presented by Firican (2017), volume, velocity, variety, variability, veracity, validity, vulnerability, volatility, visualization, and value each play a critical role in strengthening a data strategy, because they help ensure that the risks involved in handling data are effectively managed. A data defense strategy requires that all the critical components of the data be managed effectively, so that the data’s exposure to risk is significantly reduced.
Data governance is the overall management of the integrity, availability, usability, and security of the data used in an organization. An effective data governance program needs to include a governing council, defined procedures, and a plan for executing the data strategy and those procedures. The context problem in data governance refers to an organization’s inability to appropriately define and enhance the usability and reliability of its data. The key goals of a data governance strategy include, but are not limited to, the following:
A data governance program is guided by strategic, tactical, and operational guidelines, which stakeholders use as the basis for analysis and development (Marr, 2017). To organize and use data effectively and efficiently in the company’s context, data governance needs to be an ongoing process, as only then can the reliability of the data produced by the company be determined.
Chisholm examined the stages that are followed when developing a life cycle for data (Firican, 2017). The data life cycle is critical because it defines the stages that data goes through as it is processed in an organization.
The first stage that data entering an enterprise passes through is the firewall, after the data has been captured. Data capture is defined as the act of creating data values that do not yet exist and have never existed in the organization. Three of the ways that data can be captured include:
The data governance activity involved at data capture is validation of the data, to ensure that it does not contain malware that could infect other existing systems in the organization.
Data publication is the sixth stage of the data life cycle. It involves communicating the data to persons outside the organization or to its shareholders. Data is usually published after all the critical systems have been included and the authenticity, reliability, and accuracy of the data have been ensured (Marr, 2017). These components all enhance the quality of the data that an enterprise publishes. Data governance comes into play in data publication by ensuring that published data is accurate and does not mislead the parties for whom it is intended (Firican, 2017).
Data stewards use the organization’s data governance process to ensure the fitness of key data elements, which includes both the content and the metadata. They often share some responsibilities with the data custodian. A data governance program needs an individual to champion its prerequisites and to ensure that the data produced meets quality thresholds. The following is an in-depth explanation of some of the roles of a data steward:
Overseeing the life cycle of a given set of data is one of the key duties of data stewards. They are responsible for defining and implementing the policies and procedures that ensure the day-to-day operational and administrative management of systems and data in an enterprise. This includes the intake, processing, storage, and transmission of data to external and internal systems (Hebbar, 2017).
Data stewards are responsible for establishing data quality requirements and data metrics, which includes defining the values, ranges, and other parameters that are essential for each data element. Data stewards therefore engage in an ongoing, detailed evaluation of the quality of data (Fleckenstein & Fellows, 2018).
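As a rough illustration of what defining values and ranges for each data element might look like in practice, the sketch below checks records against a small set of rules and reports a simple quality metric. The element names (`age`, `country`, `email`), the allowed values, and the helper functions are all hypothetical and are not taken from any of the cited sources.

```python
from dataclasses import dataclass
from typing import Any, Callable


@dataclass(frozen=True)
class ElementRule:
    """Quality rule for one data element (all names here are hypothetical)."""
    name: str
    check: Callable[[Any], bool]
    description: str


# Assumed rules; a real steward would derive these from business requirements.
RULES = [
    ElementRule("age", lambda v: isinstance(v, int) and 0 <= v <= 120,
                "integer between 0 and 120"),
    ElementRule("country", lambda v: isinstance(v, str) and v in {"US", "CA", "MX"},
                "one of the allowed country codes"),
    ElementRule("email", lambda v: isinstance(v, str) and "@" in v,
                "string containing '@'"),
]


def violations(record: dict) -> list[str]:
    """Return the names of elements that are missing or fail their rule."""
    failed = []
    for rule in RULES:
        if rule.name not in record or not rule.check(record[rule.name]):
            failed.append(rule.name)
    return failed


def pass_rate(records: list[dict]) -> float:
    """Share of records with no violations: one simple data quality metric."""
    if not records:
        return 1.0
    clean = sum(1 for r in records if not violations(r))
    return clean / len(records)
```

Tracking a metric such as `pass_rate` over time is one possible way to support the ongoing evaluation described above. The rules themselves would come from the steward’s documented requirements rather than from code.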
Protecting data is among the more challenging aspects of data stewardship. Data stewards should establish protocols, guidelines, and procedures that guard the data against proliferation, so that privacy controls are enforced in the process (Hebbar, 2017). To be effective, data stewards should also compile the retention, archival, and disposal requirements that ensure compliance with regulations and the policies of the organization.
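Retention, archival, and disposal requirements can be expressed as simple rules on the age of a record. The following is a minimal sketch under assumed thresholds: the one-year and seven-year periods and the function name `retention_action` are illustrative only, since real periods depend on the applicable regulations and company policy.

```python
from datetime import date, timedelta

# Hypothetical thresholds; real values come from regulation and company policy.
ARCHIVE_AFTER = timedelta(days=365)
DISPOSE_AFTER = timedelta(days=7 * 365)


def retention_action(last_used: date, today: date) -> str:
    """Return 'retain', 'archive', or 'dispose' based on time since last use."""
    age = today - last_used
    if age >= DISPOSE_AFTER:
        return "dispose"
    if age >= ARCHIVE_AFTER:
        return "archive"
    return "retain"
```

Keeping the thresholds in one place makes it easier to review them against policy and to change them when regulations change.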
Data stewards define the policies and procedures used in accessing data, including the criteria for authorization. They work closely with data custodians to establish controls, evaluate any suspected breach or vulnerability in confidentiality, and report it to management or to the personnel tasked with information security (Marr, 2017). Privacy, security, and risk management is the role that relates to data offense: by ensuring that appropriate risk management strategies are developed, the data steward protects the system against offenses on its data (Fleckenstein & Fellows, 2018). A data strategy that focuses on data defense, on the other hand, implements policies and procedures aimed at keeping data safe and protected from vulnerabilities.
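The authorization criteria mentioned above could, for example, combine a sensitivity level with departmental ownership. The sketch below is one hypothetical way to encode such criteria. The levels, the `User` and `Dataset` classes, and the rule itself are assumptions made for illustration, not a policy described in the cited sources.

```python
from dataclasses import dataclass

# Hypothetical sensitivity levels, ordered from least to most restricted.
LEVELS = {"public": 0, "internal": 1, "confidential": 2}


@dataclass(frozen=True)
class User:
    name: str
    clearance: str          # one of the keys in LEVELS
    departments: frozenset  # departments the user belongs to


@dataclass(frozen=True)
class Dataset:
    name: str
    sensitivity: str        # one of the keys in LEVELS
    owner_department: str


def is_authorized(user: User, dataset: Dataset) -> bool:
    """Allow access when clearance is sufficient and, for non-public data,
    the user belongs to the department that owns the dataset."""
    if LEVELS[user.clearance] < LEVELS[dataset.sensitivity]:
        return False
    if dataset.sensitivity == "public":
        return True
    return dataset.owner_department in user.departments
```

A data custodian would typically enforce a rule like this in the systems themselves, while the steward owns the criteria it encodes.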
Within the data life cycle (DLC), quality can be achieved in almost all stages, provided that all parties adhere to the data requirements and to the need for accurate data. Data maintenance is one of the stages within the DLC in which quality can be planned and achieved sustainably, and data stewards are responsible for ensuring that it meets all the sustainability requirements. Once data has been captured, it has to be maintained. Data maintenance can be defined as the supplying of data to the points at which data synthesis and data usage occur. It is the processing of data without yet deriving any value from it for the enterprise.
The quality activities that can be accomplished include ensuring the accuracy and reliability of data and ensuring that data processing is done effectively. Data publication is another stage into which quality needs to be built. An organization needs to ensure that the data it publishes is accurate and meets the threshold required to satisfy user needs. With the help of data stewards and data custodians, data accuracy and reliability are likely to be achieved, which ensures the quality of the information.
A lot of information has been presented in class to date. In data management, it is very important that all critical stakeholders spearhead data governance and the management of organizational data. A big data strategy is therefore an integral part of data management. Characteristics such as the value, volume, veracity, velocity, and validity of data need to be taken seriously when crafting a data strategy, as this is crucial in addressing the data issues that organizations face today. In the systems development life cycle, it is crucial to develop a strategy that is in line with all the stages and cycles. I have also learnt that data management techniques are critical to business process development and to understanding various business models. A data management strategy will help address the risks and vulnerabilities that information systems face.
References
Firican, G. (2017, February 8). The 10 Vs of Big Data [Web log comment]. Retrieved from https://tdwi.org/articles/2017/02/08/10-vs-of-big-data.aspx
Fleckenstein, M., & Fellows, L. (2018). Modern data strategy.
Hebbar, P. (2017, September 26). Who is a Data Steward and What are His Roles and Responsibilities. Analytics India. Retrieved from https://www.analyticsindiamag.com/data-steward-roles-responsibilities/
Marr, B. (2017). Data Strategy: How to Profit from a World of Big Data, Analytics and the Internet of Things.
SAS Impacts Staff. (n.d.). The importance of data quality: A sustainable approach [White paper]. Retrieved from https://www.sas.com/en_us/insights/articles/data-management/importance-of-data-quality-a-sustainable-approach.html