Assignment: DSL Design Methodology
A Survey on Domain-Specific Languages for Machine Learning
August 3, 2017
CISC 603-50- R-2017/Summer Theory of Computation
Student: Dileep Sharma
Instructor: Majid Shaalan
Contents
1 Statement
2 Abstract
3 Introduction
3.1 Big Data
3.2 Machine Learning
3.3 Domain Specific Language
4 DSL Feature Model
4.1 Language Features
4.2 Transformation Feature
4.3 DSL Tool Features
4.4 DSL Process Features
5 Languages Surveyed
5.1 OptiML
5.2 ScalOps
5.3 Scala
5.4 PIG LATIN
5.5 Breukervl
5.6 Possibility of survey of other language
5.7 Conclusions
6 Reference
1 Statement
The purpose of this paper is to identify, describe, and design a Domain-Specific Language
(DSL) applicable to machine learning in the big data space that can make the process
faster and more efficient.
2 Abstract
In the last couple of decades, the data at our disposal has increased tremendously
because of technological advances. These advances have helped us capture, store,
analyze, and visualize data, and that has led to big data. We need better algorithms
to read and analyze these big and complex datasets. Machine Learning is turning out to be
the most effective way of analyzing these datasets and predicting future behavior. To better
analyze these datasets with Machine Learning we need more computational power, which
can be obtained through parallel processing on GPUs. Machine Learning algorithms need
to be adapted and optimized to specific applications. However, programming these devices
to run efficiently and correctly is difficult and error-prone, and results in software that is
harder to read and maintain. This paper is primarily concerned with Domain-Specific
Languages that can help us write Machine Learning algorithms efficiently to analyze Big Data.
3 Introduction
Technological advances in the recent past have caused a data revolution. This high volume of
data is called big data. Every second, smartphones, tablets, cars, websites, and systems
generate a massive amount of data, and users and software engineers have access to a subset
of that data to perform their activities.
3.1 Big Data
Apart from the large amount of data, Big Data also accounts for complex data, known as
variety. Big Data has also created new challenges in data management. Traditional ways of
data storage and analysis do not scale well to this amount of data, which can reach hundreds
of terabytes or more, and new approaches are being developed to address these issues. Big
Data is commonly defined by 5 Vs:
Volume
Refers to the amount of data
Big Data does not sample
Big Data observes and tracks what happens
Velocity
Speed of data processing
Speed of data generation
Big Data is often available in real-time
Variety
Number of types of data
Variability
Inconsistency of data
Veracity
Quality of data
3.2 Machine Learning
Machine Learning is turning out to be one of the most advanced techniques to process and
make inferences from Big Data. Machine Learning is widely used to identify trends and
patterns, suggest actions, and optimize output. There are still many challenges in using
Machine Learning to solve big data problems, such as memory and time constraints. To address
these issues, we can use GPUs for parallel processing and distribute data across different
machines. There are two basic kinds of Machine Learning:
Supervised
All data is labeled
You have both input variables and output variables
Use an algorithm to learn the mapping function from the input to the output
Unsupervised
All data is unlabeled
You only have input data and no corresponding output variables
The algorithm tries to find patterns in the input data
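The contrast between the two kinds can be sketched in a few lines of plain Python. This is only an illustration under assumptions that are not in the surveyed material: the data values are made up, the supervised learner is an ordinary least-squares line fit, and the unsupervised learner is a one-dimensional k-means with two clusters. The function names fit_line and two_means are hypothetical.

```python
# Supervised: labeled pairs (x, y); learn the mapping from input to output.
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var = sum((x - mean_x) ** 2 for x in xs)
    slope = cov / var
    return slope, mean_y - slope * mean_x


# Unsupervised: inputs only, no labels; look for structure in the data.
def two_means(points, iterations=10):
    """1-D k-means with k = 2, seeded with the smallest and largest point."""
    centers = [min(points), max(points)]
    for _ in range(iterations):
        groups = [[], []]
        for p in points:
            nearest = 0 if abs(p - centers[0]) <= abs(p - centers[1]) else 1
            groups[nearest].append(p)
        # Keep the old center if a group happens to be empty.
        centers = [sum(g) / len(g) if g else c for g, c in zip(groups, centers)]
    return centers


# Illustrative data (assumed): outputs follow y = 2x + 1.
slope, intercept = fit_line([1.0, 2.0, 3.0, 4.0], [3.0, 5.0, 7.0, 9.0])

# Illustrative data (assumed): two visible groups, near 1 and near 9.
centers = two_means([1.0, 1.2, 0.8, 9.0, 9.5, 8.5])
```

In the supervised case the labels ys drive the fit; in the unsupervised case the algorithm receives only the points and has to discover the two groups itself.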
3.3 Domain Specific Language
In model-driven engineering, a Domain-Specific Language (DSL) is a specialized language
which, combined with a transformation function, serves to raise the abstraction level of
software and ease software development. Machine Learning implementations can be improved
by using techniques such as Domain-Specific Languages. A DSL solves problems in a single
domain, while a General-Purpose Language (GPL) solves problems across many domains. DSLs
allow solutions to be expressed in the idiom and at the level of abstraction of the problem
domain. DSLs offer predefined abstractions to represent concepts from the application
domain. This representation may be clearer and more intuitive. Moreover, DSL
compilers may optimize the code written for the specific domain, and they can perform
error detection more efficiently. Lastly, DSLs may have more specific tool support that helps
software engineers increase their productivity. These languages are easier to learn. There
are three kinds of DSL:
Markup Language
Specification Language
Programming Language
4 DSL Feature Model
The DSL feature model covers language, transformation, tooling, and process aspects [1].
Language and transformation are mandatory features because they are part of the DSL
definition. Tooling is also mandatory because it serves to automate the transformation from
a domain, the problem space, down to lower abstraction levels, the solution space. Process
is optional because it can be undefined or implicit.
4.1 Language Features
There are two language features [2]:
Abstract Syntax
Characterizes the elements of a domain and their relationships without implementation
considerations
Concrete Syntax
Representation of a DSL in a human-usable form
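The distinction between the two can be made concrete with a toy example. The sketch below assumes a hypothetical one-statement DSL for requesting model training; the Train class, the statement grammar, and the parse and render functions are invented for illustration and do not come from any surveyed language. The data class plays the role of the abstract syntax, and the text handled by parse and render is one possible concrete syntax for it.

```python
import re
from dataclasses import dataclass


# Abstract syntax: domain elements and their relationships, nothing else.
@dataclass(frozen=True)
class Train:
    algorithm: str
    dataset: str
    params: tuple = ()  # tuple of (name, value) pairs


# Concrete syntax: one hypothetical human-usable textual form, e.g.
#   train kmeans on customers with k=3, iterations=20
_STATEMENT = re.compile(r"^train\s+(\w+)\s+on\s+(\w+)(?:\s+with\s+(.+))?$")


def parse(text):
    """Read the concrete syntax and build the abstract syntax."""
    match = _STATEMENT.match(text.strip())
    if match is None:
        raise ValueError(f"not a valid statement: {text!r}")
    algorithm, dataset, raw_params = match.groups()
    params = []
    if raw_params:
        for item in raw_params.split(","):
            name, value = item.split("=")
            params.append((name.strip(), int(value)))
    return Train(algorithm, dataset, tuple(params))


def render(node):
    """Write the abstract syntax back out in the same concrete syntax."""
    text = f"train {node.algorithm} on {node.dataset}"
    if node.params:
        text += " with " + ", ".join(f"{n}={v}" for n, v in node.params)
    return text


statement = "train kmeans on customers with k=3, iterations=20"
node = parse(statement)
```

The same Train value could equally be rendered as a diagram or a form; only the concrete syntax would change, which is why the two features are modeled separately.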
Figure 1: Roots of DSL Feature Language
Figure 2: Language Feature
Figure 3: Root of the Transformation Features
4.2 Transformation Feature
The transformation feature ensures the correspondence from the problem to the solution, and
takes into account the problem-to-solution element mapping and all design, implementation,
platform, and architecture decisions. A transformation has to answer three questions [3]:
How is the transformation specified [4]? What assets are expected from the transformation [5]?
How is the transformation realized to produce the expected assets [6]?
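One possible reading of the three questions is shown in the following minimal sketch of a template-based generator. Everything in it is an assumption made for illustration: the Train node, the template text, and the target-platform names kmeans and load that appear in the generated line are hypothetical and belong to no real library. Here the specification is a text template, the expected asset is a line of Python source, and the realization is a function that fills in the template.

```python
import ast
from dataclasses import dataclass


# Problem space: a hypothetical DSL element describing a training request.
@dataclass(frozen=True)
class Train:
    algorithm: str
    dataset: str
    params: tuple = ()  # tuple of (name, value) pairs


# Specification: a template that maps the element to solution-space code.
# The function names that end up in the output are assumed, not real APIs.
TEMPLATE = "model = {algorithm}(load({dataset!r}){args})"


def transform(node):
    """Realization: fill the template to produce the asset (source text)."""
    args = "".join(f", {name}={value!r}" for name, value in node.params)
    return TEMPLATE.format(algorithm=node.algorithm, dataset=node.dataset, args=args)


generated = transform(Train("kmeans", "customers", (("k", 3),)))

# Check that the asset is at least syntactically valid Python.
ast.parse(generated)
```

Design, platform, and architecture decisions would show up as changes to the template or to the set of templates, while the problem-space element stays the same.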