Industry 4.0, or the fourth industrial revolution, which blurs the boundaries between the physical and the digital, is underpinned by vast amounts of data collected by sensors that monitor the processes and components of smart factories, which continuously communicate with one another and with network hubs via the Internet of Things. Yet collecting those data, which are inherently imperfect and burdened with uncertainty and noise, entails costs for hardware and software, data storage, processing, interpretation and integration into the decision-making process, to name just a few of the main expenditures. This paper discusses a framework for rationalizing the adoption of (big) data collection for Industry 4.0. Pre-posterior Bayesian decision analysis is used to that end, and the evolution of an industrial process over time is conceptualized as a stochastic, observable and controllable dynamical system. The chief underlying motivation is to derive the most benefit from the collected data by trading off the management of risks associated with failure of the monitored processes and/or their components against the cost of data collection, processing and interpretation. This enables the formulation of optimization problems for data collection, e.g. for selecting the type, topology and/or time of deployment of the monitoring system. An example involving the monitoring of an assembly line and the optimization of the topology of its monitoring system illustrates the theoretical concepts.
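
To make the trade-off between risk reduction and monitoring cost concrete, the following is a minimal sketch of a pre-posterior analysis for a single binary decision. It is not the paper's model: the two-state process (failed or not), the two actions (continue or repair), the `Monitor` class, the candidate names `sparse` and `dense`, and all probabilities and costs are hypothetical values chosen only for illustration. The value of information of a monitoring option is taken as the prior optimal expected cost minus the expected cost when the action is chosen after observing the monitoring outcome; the net benefit subtracts the cost of the option itself.

```python
from dataclasses import dataclass

# All numbers below are hypothetical and serve only to illustrate the idea.
P_FAIL = 0.2          # prior probability that the process fails
LOSS_FAILURE = 100.0  # loss if we continue operating and the process fails
COST_REPAIR = 10.0    # cost of a preventive repair (assumed to avert failure)


@dataclass(frozen=True)
class Monitor:
    """A candidate monitoring option with imperfect (noisy) indications."""
    name: str
    sensitivity: float  # P(alarm | failure)
    specificity: float  # P(no alarm | no failure)
    cost: float         # cost of installing and running the option


def optimal_cost(p_fail: float) -> float:
    """Expected cost of the best action for a given failure probability."""
    cost_continue = p_fail * LOSS_FAILURE
    return min(cost_continue, COST_REPAIR)


def preposterior_cost(m: Monitor, p_fail: float = P_FAIL) -> float:
    """Expected cost when the action is chosen after seeing the outcome."""
    p_alarm = m.sensitivity * p_fail + (1 - m.specificity) * (1 - p_fail)
    p_quiet = 1 - p_alarm
    # Bayes' rule: posterior failure probability for each possible outcome.
    post_alarm = m.sensitivity * p_fail / p_alarm if p_alarm > 0 else 0.0
    post_quiet = (1 - m.sensitivity) * p_fail / p_quiet if p_quiet > 0 else 0.0
    # Average the posterior-optimal cost over the outcome distribution.
    return p_alarm * optimal_cost(post_alarm) + p_quiet * optimal_cost(post_quiet)


def value_of_information(m: Monitor, p_fail: float = P_FAIL) -> float:
    """Reduction in expected cost due to monitoring, before paying for it."""
    return optimal_cost(p_fail) - preposterior_cost(m, p_fail)


def net_benefit(m: Monitor, p_fail: float = P_FAIL) -> float:
    """Value of information minus the cost of the monitoring option."""
    return value_of_information(m, p_fail) - m.cost


CANDIDATES = [
    Monitor("none", sensitivity=0.0, specificity=1.0, cost=0.0),
    Monitor("sparse", sensitivity=0.9, specificity=0.9, cost=2.0),
    Monitor("dense", sensitivity=0.99, specificity=0.99, cost=7.0),
]

# Select the option with the largest net benefit.
best = max(CANDIDATES, key=net_benefit)

for m in CANDIDATES:
    print(f"{m.name}: VoI={value_of_information(m):.2f}, net={net_benefit(m):.2f}")
print("selected:", best.name)
```

With these assumed numbers the more accurate `dense` option has the higher value of information, but its higher cost makes the `sparse` option the better choice, which is the kind of trade-off the optimization over monitoring system type and topology is meant to resolve.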