Cluster Method of Description of Information System Data Model Based on Multidimensional Approach (Part 3)

We can identify two cases in which this approach can be used successfully. The first arises when the analysis of different semantic components leads to different subdivisions of the dimensions into layers. The second arises when there is a simple way to build a subset that describes the SPMC redundantly, together with an efficient way to describe the combinations that must be excluded from this subset to reduce it to the SPMC. Let us consider these cases in more detail.

In the first case, the decomposition of the observed phenomenon into $l$ semantic components corresponds to a union of subsets of member combinations:
$SPMC(H) = Q_1 \cup Q_2 \cup \dots \cup Q_l$.
The set of dimensions of the analytical space can be divided into layers in different ways, because the components of the observed phenomenon differ in their semantics:
$D(H) = L_1^i \cup L_2^i \cup \dots \cup L_{m_i}^i$,
where $i = 1, \dots, l$ is the index of the component and $m_i$ is the number of layers for the $i$-th component. Each subset $Q_i$ is formed according to its own split of the set of dimensions into layers.
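
The first case can be illustrated with a minimal Python sketch. It assumes that each partial combination is stored as a mapping from dimension name to member, so that components with different layer splits can still be united. The dimension names A, B, C, their members, and the helper `combine_layers` are hypothetical and serve only as an illustration.

```python
from itertools import product

def combine_layers(layer_subsets):
    """Build Q_i as the Cartesian product of the per-layer subsets.

    Each layer subset is a list of partial combinations (dimension -> member).
    """
    combos = set()
    for parts in product(*layer_subsets):
        merged = {}
        for part in parts:
            merged.update(part)  # layers cover disjoint dimensions
        combos.add(frozenset(merged.items()))
    return combos

# Hypothetical component 1: dimensions split into layers {A, B} and {C}.
component_1 = [
    [{"A": "a1", "B": "b1"}, {"A": "a2", "B": "b2"}],
    [{"C": "c1"}],
]
# Hypothetical component 2: the same dimensions split into {A} and {B, C}.
component_2 = [
    [{"A": "a3"}],
    [{"B": "b1", "C": "c2"}, {"B": "b3", "C": "c1"}],
]

q1 = combine_layers(component_1)
q2 = combine_layers(component_2)
spmc = q1 | q2  # SPMC(H) as the union of the component subsets
```

Keying every member by its dimension name makes the combinations comparable regardless of how each component groups the dimensions into layers.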

In the second case, the set of possible member combinations is represented as the difference of two subsets:
$SPMC(H) = R \setminus Q$,
where $R$ is the set of member combinations described with an excess (the set to reduce), and $Q$ is the set of combinations to be excluded. The set to reduce may be formed using the following rules. It should include the member combinations obtained as the Cartesian product of all members of all dimensions. It must be supplemented with the combinations that contain the special value “Not in use” for those dimensions for which this value is acceptable. From this set, those combinations that can be obtained by replacing the special value “Not in use” with a member should be excluded. This approach can be used when the set $SPMC(H)$ has a complex structure and a simple algorithm for forming the subset $Q$ can be offered.
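
The second case can be sketched in Python as follows. This is only an illustration under stated assumptions: the dimensions `unit_type` and `industry`, their members, the helper `build_set_to_reduce`, and the exclusion rule that defines Q are all hypothetical. The sketch builds the set to reduce from the full Cartesian product, extended with “Not in use” where that value is acceptable, and then subtracts Q.

```python
from itertools import product

NOT_IN_USE = "Not in use"

def build_set_to_reduce(dimensions, allows_not_in_use):
    """Return R: all member combinations, extended with NOT_IN_USE
    for the dimensions where that special value is acceptable."""
    extended = [
        list(members) + ([NOT_IN_USE] if name in allows_not_in_use else [])
        for name, members in dimensions.items()
    ]
    return set(product(*extended))

# Hypothetical dimensions; only "industry" accepts the special value.
dimensions = {
    "unit_type": ["household", "enterprise"],
    "industry": ["agriculture", "mining"],
}
R = build_set_to_reduce(dimensions, allows_not_in_use={"industry"})

# Hypothetical exclusion rule: households never carry an industry,
# and enterprises always do. Q holds the combinations that violate it.
Q = {c for c in R if (c[0] == "household") != (c[1] == NOT_IN_USE)}

spmc = R - Q  # SPMC(H) = R \ Q
```

Here R contains six combinations and Q removes three of them, so a short rule replaces an explicit listing of the valid combinations.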

4 Method of constructing the set of possible member combinations

We propose an algorithm for describing the SPMC that is based on the cluster approach and consists of the following steps:

1. Identify $n$ semantic components ($n \geq 1$) within the observed phenomenon and associate each component with a subset of combinations $Q_i$, $i = 1, \dots, n$;
2. Construct a formula for $SPMC(H)$ from the subsets $Q_i$ and set-theoretic operations, according to the relationships revealed between the components of the observed phenomenon;
3. Form the subset of combinations for each $Q_i$:
(a) analyze the pairwise relations between the dimensions that correspond to the semantics of $Q_i$, and form the groups of members that express these relations;
(b) divide the set of dimensions into layers and build the dimension connectivity diagram for each layer;
(c) subdivide the groups of members within each layer according to the relations shown in the layer connectivity diagrams;
(d) form the clusters of member combinations and consolidate these clusters into the subsets of combinations for the layers;
(e) form the subset of combinations $Q_i$ as the Cartesian product of the subsets of combinations for the dimension layers;
4. Calculate $SPMC(H)$ using the constructed formula.
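
The last steps of the algorithm, from cluster formation to the calculation of SPMC(H), can be sketched in Python. The sketch assumes that a cluster is the Cartesian product of one group of members per dimension of a layer, and that the groups have already been derived from the connectivity analysis. The dimensions (region, currency, year), the groups, and the function names are hypothetical.

```python
from itertools import product

def cluster(groups):
    """Cluster of a layer: product of one member group per dimension."""
    return set(product(*groups))

def layer_subset(clusters):
    """Consolidate the clusters of a layer into one subset of combinations."""
    return set().union(*clusters)

def component_subset(layer_subsets):
    """Q_i: Cartesian product of the layer subsets, flattened into tuples."""
    return {sum(parts, ()) for parts in product(*layer_subsets)}

# Component 1, layer over (region, currency): two clusters of related members.
layer_1 = layer_subset([
    cluster([["DE", "FR"], ["EUR"]]),
    cluster([["US"], ["USD"]]),
])
# Component 1, layer over (year): a single cluster.
layer_2 = layer_subset([cluster([["2015", "2016"]])])
q1 = component_subset([layer_1, layer_2])

# Component 2 uses the same dimension order but different groups.
q2 = component_subset([
    layer_subset([cluster([["JP"], ["JPY"]])]),
    layer_subset([cluster([["2016"]])]),
])

spmc = q1 | q2  # formula assumed for this example: SPMC(H) = Q1 ∪ Q2
```

The groups keep the description compact: the three member groups of the first layer stand for its three valid (region, currency) pairs, and no invalid pair such as ("US", "EUR") is ever generated.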

5 Conclusion

When a large, multi-aspect multidimensional information system is developed, using the cluster approach to describe the set of possible member combinations keeps the metadata specification compact and expresses the semantics of the observed phenomenon. The proposed approach is based on identifying the relations between dimensions that reflect the properties of the observed phenomenon, and on forming groups of members whose elements behave similarly with respect to these relations.

Acknowledgments
The work is partially supported by the Ministry of Education and Science of the Russian Federation (the Agreement number 02.a03.21.0008).
