BALab yearly reports

Members

Faculty Members

Damianos Chatziantoniou

Maria Kechagia

Dimitris Mitropoulos

Panos (Panagiotis) Louridas

Diomidis Spinellis

Senior Researchers

Nikolaos Alexopoulos

Vaggelis Atlidakis

Makrina Viola Kosti

Vasiliki Efstathiou

Stefanos Georgiou

Thodoris Sotiropoulos

Marios Fragkoulis

Associate Researchers

Charalambos-Ioannis Mitropoulos

Zoe Kotti

Konstantinos Kravvaritis

Stefanos Chaliasos

Antonios Gkortzis

Tushar Sharma

Konstantina Dritsa

Researchers

Andriani Nikolaou

Efthymios Kontoes

Panagiotis Daskalopoulos

Georgios Liargkovas

Ioanna Moraiti

Evangelos Talos

Giannis Karyotakis

Ilias Mpourdakos

Apostolos Garos

Chris Lazaris

Rafaila Galanopoulou

Evangelia Panourgia

Christina Zacharoula Chaniotaki

Georgios-Petros Drosos

George Theodorou

Christos Pappas

Angeliki Papadopoulou

George Metaxopoulos

Theodosis Tsaklanos

Michael Loukeris

Marios Papachristou

Christos Chatzilenas

Ioannis Batas

Efstathia Chioteli

Vitalis Salis

Overview in numbers

New Publications	Number
Monographs and Edited Volumes	0
PhD Theses	0
Journal Articles	5
Book Chapters	0
Conference Publications	5
Technical Reports	0
White Papers	0
Magazine Articles	0
Working Papers	0
Datasets	0
Total New Publications	10
Projects
New Projects	1
Ongoing Projects	1
Completed Projects	0
Members
Faculty Members	5
Senior Researchers	7
Associate Researchers	7
Researchers	25
Total Members	44
New Members	7
PhDs
Ongoing PhDs	4
Completed PhDs	0
New Seminars
New Seminars	10

New Publications

Journal Articles

Diomidis Spinellis. Open reproducible scientometric research with Alexandria3k. PLOS ONE, 18(11):e0294946, November 2023.

Diomidis Spinellis. Commands as AI conversations. IEEE Software, 40(6):22–26, November 2023.

Zoe Kotti, Georgios Gousios, and Diomidis Spinellis. Impact of software engineering research in practice: A patent and author survey analysis. IEEE Transactions on Software Engineering, 49(4):2020–2038, April 2023.

Zoe Kotti, Rafaila Galanopoulou, and Diomidis Spinellis. Machine learning for software engineering: A tertiary study. ACM Computing Surveys, 55(12):1–39, March 2023.

Christof Ebert and Panos Louridas. Generative AI for software practitioners. IEEE Software, 40(4):30–38, July 2023.

Conference Publications

Charalambos Mitropoulos, Thodoris Sotiropoulos, Sotiris Ioannidis, and Dimitris Mitropoulos. Syntax-aware mutation for testing the Solidity compiler. In 28th European Symposium on Research in Computer Security, ESORICS '23. September 2023.

Georgios Liargkovas, Konstantinos Kallas, Michael Greenberg, and Nikos Vasilakis. Executing shell scripts in the wrong order, correctly. In Proceedings of the 19th Workshop on Hot Topics in Operating Systems, HOTOS '23, 103–109. New York, NY, USA, 2023. Association for Computing Machinery.

Jesse Harte, Wouter Zorgdrager, Panos Louridas, Asterios Katsifodimos, Dietmar Jannach, and Marios Fragkoulis. Leveraging large language models for sequential recommendation. In Proceedings of the 17th ACM Conference on Recommender Systems. Late-breaking Results Papers, RecSys '23, 1096–1102. 2023.

Apostolos P. Fournaris, Christos Tselios, Evangelos Haleplidis, Elias Athanasopoulos, Antreas Dionysiou, Dimitrios Mitropoulos, Panos Louridas, Georgios Christou, Manos Athanatos, George Hatzivasilis, Konstantinos Georgopoulos, Costas Kalogeros, Christos Kotselidis, Simon Vogl, Francois Hamon, and Sotiris Ioannidis. Providing security assurance & hardening for open source software/hardware: the SecOPERA approach. In 2023 IEEE 28th International Workshop on Computer Aided Modeling and Design of Communication Links and Networks (CAMAD), 80–86. 2023.

Stefanos Chaliasos, Marcos Antonios Charalambous, Liyi Zhou, Rafaila Galanopoulou, Arthur Gervais, Dimitris Mitropoulos, and Ben Livshits. Smart contract and DeFi security: insights from tool evaluations and practitioner surveys. In The Science of Blockchain Conference 2023, SBC '23. September 2023.

Projects

New Projects

SecOPERA - Secure OPen source softwarE and hardwaRe Adaptable framework

Ongoing Projects

HFRI III (PhD Scholarship) - Data Analysis Applications in Software Engineering

New Members

Andriani Nikolaou

Efthymios Kontoes

Panagiotis Daskalopoulos

Ioanna Moraiti

Evangelos Talos

Giannis Karyotakis

Ilias Mpourdakos

Ongoing PhDs

Zoe Kotti - Topic: Data Analysis Applications in Software Engineering

Konstantinos Kravvaritis - Topic: Data and Quality Metrics of System Configuration Code

Antonios Gkortzis - Topic: Secure Systems on Cloud Computing Infrastructures

Konstantina Dritsa - Topic: Data Science

Seminars

Impact of Software Engineering Research in Practice: A Patent and Author Survey Analysis

Date: 09 January 2023
Presenter: Zoe Kotti
Abstract

Existing work on the practical impact of software engineering (SE) research examines industrial relevance rather than adoption of study results, hence the question of how results have been practically applied remains open. To answer this and investigate the outcomes of impactful research, we performed a quantitative and qualitative analysis of 4,354 SE patents citing 1,690 SE papers published in four leading SE venues between 1975 and 2017. Moreover, we conducted a survey of 475 authors of 593 top-cited and awarded publications, achieving a 26% response rate. Overall, researchers have equipped practitioners with various tools, processes, and methods, and improved many existing products. SE practice values knowledge-seeking research and is impacted by diverse cross-disciplinary SE areas. Practitioner-oriented publication venues appear more impactful than researcher-oriented ones, while industry-related tracks in conferences could enhance their impact. Some research works did not reach a wide footprint due to limited funding resources or an unfavorable cost-benefit trade-off of the proposed solutions. The need for higher SE research funding could be corroborated through a dedicated empirical study. In general, the assessment of impact is subject to its definition. Therefore, academia and industry could jointly agree on a formal description to set a common ground for subsequent research on the topic.

Licensing issues in open source software

Date: 01 March 2023
Presenter: Georgia Kapitsaki
Abstract

Georgia (Zeta) M. Kapitsaki is an Associate Professor at the Department of Computer Science, UCY. She received her Ph.D. degree in Electrical and Computer Engineering from the School of Electrical and Computer Engineering of National Technical University of Athens (NTUA), her M.Sc. in Technoeconomical Systems and her Diploma in Electrical and Computer Engineering. She has worked as a software engineer in Germany, as a research associate at NTUA and as a laboratory assistant at the Technical Institute of Piraeus. She has been the principal investigator of national and EU funded research projects (e.g. SocioCoast, CYberSafetyIII) and has worked on EU projects (e.g. PaaSage, VALS). She has been involved in the organization of international conferences (e.g. ICSME 2022, SAC 2019). She has served as a member of the program committee of international conferences (e.g. WISE 2022, ICWS 2022, ENASE 2022). She is a member of the editorial board of ERCIM News. She has published over 60 papers in peer-reviewed journals, scientific conferences, and workshops. Her research interests include Software Engineering, Open source software and reuse, Privacy Enhancing Technologies and Context-aware Applications. She is a faculty member of the Software Engineering and Internet Technologies Laboratory (SEIT).

The research and academic environment in the United States

Date: 29 May 2023
Presenter: Georgios Liargkovas
Abstract

The research and academic environment in the United States is a topic of great interest to students and researchers worldwide. In this presentation, as an exchange student who recently returned from a program at Brown University, I will share my experience and provide (I think valuable) insights regarding the research environment, life in the US, and significant lessons I learned during this period.

The presentation will begin with an introduction, covering the basic details of the program. Details about the US research environment, student life, challenges faced, and lessons learned will then be discussed. Finally, a brief reference will be made to the recently accepted paper at HotOS '23 "Executing Shell Scripts in the Wrong Order, Correctly", which will be presented in June.

Syntax-Aware Mutation for Testing the Solidity Compiler

Date: 20 September 2023
Presenter: Charalambos Mitropoulos
Abstract

We introduce Fuzzol, the first syntax-aware mutation fuzzer for systematically testing the security and reliability of solc, the standard Solidity compiler. Fuzzol addresses a challenge of existing fuzzers when dealing with structured inputs: the generation of inputs that get past the parser checks of the system under test. To do so, Fuzzol introduces a novel syntax-aware mutation that breaks down into three strategies, each of them making a different kind of change to the inputs. Moreover, to explore new paths in the compiler’s codebase faster, we introduce a mutation strategy prioritization algorithm that allows Fuzzol to identify and apply only those mutation strategies that are most effective in exercising new interesting paths. To evaluate Fuzzol, we test 33 of the latest solc stable releases, and compare Fuzzol with (1) Superion, a grammar-aware fuzzer, (2) AFL-compiler-fuzzer, a text-mutation fuzzer, and (3) two grammar-blind fuzzers with advanced test input generation schedules: AFLFast and MOpt-AFL. Fuzzol identified 19 bugs in total (7 of which were previously unknown to Solidity developers), while the other fuzzers missed half of these bugs.

This work has been accepted at ESORICS '23 and will be presented in September 2023.
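The abstract does not describe how the prioritization algorithm works, so the following is only a minimal sketch of the general idea of favoring strategies that have recently exercised new paths. It is not Fuzzol's algorithm. The class names, the strategy names, and the smoothed success-rate scoring are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class StrategyStats:
    """Running tally for one mutation strategy (hypothetical structure)."""
    trials: int = 0
    new_paths: int = 0

    def score(self) -> float:
        # Laplace-smoothed success rate, so an untried strategy scores 0.5.
        return (self.new_paths + 1) / (self.trials + 2)


@dataclass
class Prioritizer:
    """Greedy selector over mutation strategies (illustrative only)."""
    stats: dict = field(default_factory=dict)

    def add(self, name: str) -> None:
        self.stats[name] = StrategyStats()

    def pick(self) -> str:
        # Choose the strategy with the best smoothed success rate.
        # Ties resolve to the strategy that was registered first.
        return max(self.stats, key=lambda n: self.stats[n].score())

    def record(self, name: str, found_new_path: bool) -> None:
        entry = self.stats[name]
        entry.trials += 1
        entry.new_paths += int(found_new_path)


# Hypothetical strategy names; the paper's three strategies are not listed here.
prioritizer = Prioritizer()
for strategy in ("replace_node", "insert_node", "delete_node"):
    prioritizer.add(strategy)

prioritizer.record("replace_node", found_new_path=False)
prioritizer.record("insert_node", found_new_path=True)
print(prioritizer.pick())  # insert_node
```

A purely greedy choice like this can starve strategies that had an unlucky start; a real fuzzer would likely mix in some exploration.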

SCALE-BOSS: A framework for scalable time-series classification using symbolic representations

Date: 04 October 2023
Presenter: Apostolos Glenis
Abstract

Time-Series Classification (TSC) is an important problem in many fields across the sciences. Many algorithms for TSC use symbolic representations to combat noise. In this paper we propose a framework, namely SCALE-BOSS, to build TSC algorithms that exploit time-series models based on symbolic representations. While alternative symbolic representations can be incorporated, we have opted to use the Bag-Of-SFA (BOSS) approach, and thus SFA, as a state-of-the-art symbolic time series representation. We investigate the efficiency of several instantiations of this framework based on two main variations, where the TSC model is built either by a time-series classification algorithm or by a clustering algorithm. The objective is to advance the computational efficiency of TSC algorithms without sacrificing their accuracy. We evaluate the instantiations of the SCALE-BOSS framework on those datasets in the UCR time-series repository that include the largest training sets. Comparisons with state-of-the-art methods on TSC show the balance achieved between computational efficiency and prediction accuracy.
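To make the idea of a bag of symbolic words concrete, here is a deliberately simplified sketch. It does not implement SFA or BOSS: instead of Fourier coefficients it averages window segments and bins the averages against fixed breakpoints (closer to a SAX-style discretization), and it skips normalization. The function names, window size, breakpoints, and alphabet are assumptions chosen for illustration.

```python
from collections import Counter
from statistics import mean


def to_word(window, segments=2, breakpoints=(-0.5, 0.5), alphabet="abc"):
    """Turn one window into a short word: average each segment, then bin it."""
    size = len(window) // segments
    word = ""
    for i in range(segments):
        avg = mean(window[i * size:(i + 1) * size])
        # The number of breakpoints the average exceeds selects the letter.
        word += alphabet[sum(avg > b for b in breakpoints)]
    return word


def bag_of_words(series, window=4, **kwargs):
    """Histogram of the words produced by sliding a window over the series."""
    return Counter(
        to_word(series[i:i + window], **kwargs)
        for i in range(len(series) - window + 1)
    )


bag = bag_of_words([0, 0, 1, 1, 0, 0, -1, -1])
print(dict(bag))  # {'bc': 1, 'bb': 1, 'cb': 1, 'ba': 2}
```

Two series can then be compared through their word histograms rather than their raw values, which is what makes such representations tolerant of small amounts of noise.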

Machine Learning for Software Engineering: Current State, Opportunities, and Challenges

Date: 16 October 2023
Presenter: Tushar Sharma, Dalhousie University
Abstract

Machine Learning (ML) has emerged as a transformative force across various domains, and Software Engineering is no exception. In this talk, we will explore how ML is currently being applied in software engineering, delve into the exciting opportunities it offers, and discuss the key challenges that must be overcome to unlock its full potential. Using a recently completed survey paper as a reference, Tushar will discuss the fine-grained tasks, tools, and techniques for applying ML in various software engineering tasks. Tushar will provide an overview of the recent research studies from his lab in this direction. The talk will offer an opportunity to discuss the rapidly changing software engineering landscape due to recent disruptive advancements in ML.

Tushar Sharma is an assistant professor at Dalhousie University, Canada. The topics related to software design and architecture, refactoring, software code quality, technical debt, and machine learning for software engineering (ML4SE) define his career interests. He earned a PhD from Athens University of Economics and Business, Athens, Greece, specializing in software engineering in May 2019. Earlier, he obtained an MS in Computer Science from the Indian Institute of Technology-Madras, Chennai, India. His professional experience includes working with Siemens Research (formally, Siemens Technology), Charlotte, USA for approximately two years (2019-2021) as well as Siemens Corporate Technology, Bangalore, India for more than seven years (2008-2015). He was the principal investigator for the MINDSIGHT team of the DARPA AMP program, consisting of researchers from Siemens, JHU/APL, BAE Systems, and UC Irvine. He co-authored Refactoring for Software Design Smells: Managing Technical Debt and two Oracle Java certification books. He founded and developed Designite, a software design quality assessment tool used by many practitioners and researchers worldwide. He is an IEEE Senior Member.

Leveraging Large Language Models for Sequential Recommendation

Date: 01 November 2023
Presenter: Panos Louridas
Abstract

Sequential recommendation problems have received increasing attention in research during the past few years, leading to the inception of a large variety of algorithmic approaches. In this work, we explore how large language models (LLMs), which are nowadays introducing disruptive effects in many AI-based applications, can be used to build or improve sequential recommendation approaches. Specifically, we devise and evaluate three approaches to leverage the power of LLMs in different ways. Our results from experiments on two datasets show that initializing the state-of-the-art sequential recommendation model BERT4Rec with embeddings obtained from an LLM improves NDCG by 15-20% compared to the vanilla BERT4Rec model. Furthermore, we find that a simple approach that leverages LLM embeddings for producing recommendations can provide competitive performance by highlighting semantically related items. We publicly share the code and data of our experiments to ensure reproducibility.
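For readers unfamiliar with the metric quoted above, the following sketch computes NDCG (normalized discounted cumulative gain) using its standard textbook definition. It is not taken from the paper's code. The function names are hypothetical, and it assumes the input is the full list of relevance scores in the order the recommender ranked the items.

```python
import math


def dcg(relevances):
    """Discounted cumulative gain: each gain is divided by log2(rank + 1)."""
    return sum(rel / math.log2(rank + 1)
               for rank, rel in enumerate(relevances, start=1))


def ndcg_at_k(ranked_relevances, k):
    """DCG of the top-k items divided by the DCG of the ideal ordering."""
    ideal = dcg(sorted(ranked_relevances, reverse=True)[:k])
    return dcg(ranked_relevances[:k]) / ideal if ideal > 0 else 0.0


# One relevant item (the true next item) placed first versus second.
print(ndcg_at_k([1, 0, 0], k=3))            # 1.0
print(round(ndcg_at_k([0, 1, 0], k=3), 4))  # 0.6309
```

With a single relevant item per test case, as is common in next-item evaluation, the score reduces to 1 / log2(rank + 1) when the item appears in the top k, and 0 otherwise.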

ai-cli-lib: A command-line copilot

Date: 13 November 2023
Presenter: Diomidis Spinellis
Abstract

Developers, system administrators, and data scientists often struggle with powerful yet cryptic command-line interfaces. The solution? ai-cli-lib, an open-source library that converts natural language prompts into executable commands for diverse command-line tools. Its operation is based on dynamic linking, configurable AI API interfaces, and dynamic prompt engineering. The talk introduces ai-cli-lib as an AI-based productivity booster for software developers, presents an overview of building an AI-enabled software product, and discusses the use of AI in software development informed through the 261 ChatGPT interactions that aided ai-cli-lib's development.

A systematic review of datasets for intrusion detection systems

Date: 13 December 2023
Presenter: Ilias Balampanis
Abstract

Intrusion Detection Systems (IDS) based on Machine Learning (ML) techniques are essential for cybersecurity, utilizing datasets to detect and mitigate malicious network or system activities. The efficacy of IDSs is contingent on datasets that are both extensive and representative of actual cyber threats, ensuring accurate and robust system performance evaluation. This paper reports on a systematic literature review (SLR) that examines the landscape of datasets for IDS training and evaluation. The SLR explores the traits, creation methods, and constraints of these datasets, alongside the identification of challenges in their generation and utility. Highlighted are the variance in dataset quality, the gradual pace of development, and the ongoing deficit of datasets in the intrusion detection domain. With the continuous evolution of cyber threats, a persistent reassessment of these datasets is imperative for maintaining their pertinence and efficacy. Our SLR aggregates and dissects the extant research to elucidate the strengths and weaknesses of current datasets, determining their aptness for varied IDS contexts. The objective is to delineate the present state of IDS datasets, suggest measures for their enhancement, and propose future research directions. We emphasize the need for standardized, comprehensive, and ethically sound datasets that reflect the evolving threat landscape. Future research should focus on data augmentation to cover a broader spectrum of attack scenarios, improving the robustness of IDS against diverse threats. Additionally, fostering open-source datasets and cross-sector collaboration is crucial for integrating practical, real-world cybersecurity challenges into academic research, thereby democratizing access and promoting innovation in IDS development.

AI Open Seminar by UniAI

Date: 15 December 2023
Presenters: D. Karlis, G. Vouros, and G. Giannakopoulos
Abstract

AI in Sports Analytics, Dimitrios Karlis, Professor of Statistics at AUEB and member of AI@AUEB.
Learning to fly with AI: Tales on Deep Reinforcement Learning and Generative AI, George Vouros, Professor at University of Piraeus and EETN President.
The myth of omnipotent AI, George Giannakopoulos, researcher at "SKEL LAB – Demokritos", scientific coordinator at "ahedd – DIH" and co-founder of SciFy.

Note: Data before 2017 may refer to grandparented work conducted by BALab's members at its progenitor laboratory, ISTLab.

Yearly Report 2023