Internships
Summer 2024
Run OCaml Exercises Everywhere
Mentee
Divyanka Chaudhari
Mentor(s)
Cuihtlauac Alvarado <cuihtlauac@tarides.com>
Sayo Bamigbade <sayobamigbade@gmail.com>
Shakthi Kannan <shakthi@tarides.com>
At present, several groups of exercises aimed at learning OCaml are available: https://github.com/gs0510/ofronds https://github.com/sudha247/learn-ocaml-workshop https://github.com/kayceesrk/cs3100_f19 https://github.com/ocaml-sf/learn-ocaml, source for: http://ocaml-sf.org/learn-ocaml-public https://github.com/ocaml.org, source for: https://ocaml.org/exercises
There's probably more. Each set of exercises uses a different execution framework (such as Jupyter, Learn-OCaml, or custom). They provide a range of user experiences, from available solutions up to candidate solution testing. The internship's main goal is to build a common configuration that allows running each set of exercises in most known setups. This will have the following benefits:
- Exercises are no longer tied to a specific setup, they can be used in several - Loose coupling between the learning content and the technical setup - For learners: Feature parity among exercise sets, self-learning, unique, simple, and quick start - For teachers: Focus on exercise writing, using the common setup - For the community: More exercices, easier startup, improved learning curve for OCaml The goal is not to have all exercises from all the projects in all the setups. Instead, the goal is to show it is possible to do it. However, this will be established by porting a meaningful fraction of each exercise group into several setups or forks, if required. As a design constraint, we want to provide exercises at https://ocaml.org/. We consider automatically processing exercise groups to generate the https://ocaml.org/exercises page. The common setup should allow that, although it is not part of the internship.
Experimenting with an accessible diff viewer
Mentee
Alan Matthew
Mentor(s)
Paul-Elliot Anglès d'Auriac <peada@free.fr>
Jules Aguillon <juloo.dsi@gmail.com>
Looking at the difference between two files is one of the most common activity in open source work. It happens when reviewing a pull request, inspecting a commit and in many other situation. However, reading such diff is not always easy. Many tools have been to made to improve the situation and make the diff reviewing easier: The use of colours to distinguish between added and removed lines (traditionally green and red), use of a bold face to get the attention on the modified part of the line at the word level, the display of the two files side-by-side, ... However, most of the improvements mentioned above:
- Are made for "generic" diffing of files. Some files may have very long lines and be unsuitable for the current diff output. - Are restricted to sighted people. The goal of this project is to develop a visualization tool on top of Git diff. The visualization tool is a terminal UI, and the main task of the internship is to make it highly accessible to screen readers.
Winter 2023
Develop a Geometric Creative Coding Library for OCaml
Mentee
Fay Carsons
Mentor(s)
Sudha Parimala
Kaustubh M
OCaml is an industrial-strength functional programming language that's been around for nearly three decades.
While functional programming itself is not new, it has not dominated mainstram programming languages. Recently, more mainstram languages have been adopting concepts from functional programming. Now more than ever is a good time to have various types of learning material pertaining to functional programming.
Creative coding is a type of computer programming that focuses on generating artistic, expressive, and creative outputs using software and digital tools. It has its applications in places such as game development. Above all, it is a great pedagogical tool that gives visual outputs to its readers.
Joy is a tiny creative coding library in Python. Joy builds heavily on functional programming concepts with very little reference to Python syntax.
This project aims to implement a geometric creative coding library in OCaml. It is heavily inspired by Joy. When done, it will serve as a means to do geometric creative coding in OCaml.
Implement a Dark Mode for OCaml.org
Mentee
Oluwaseun Oyenuga
Mentor(s)
Sayo Bamigbade
punchagan
OCaml is a powerful, statically-typed programming language known for its efficiency and expressiveness. OCaml.org serves as the central hub for the OCaml community, providing resources, documentation, and news. In today's digital age, users expect a more personalised and comfortable web experience. One such expectation is the availability of a dark mode, which has become a popular feature on websites and applications. This project outlines the plan to implement a dark mode for OCaml.org, enhancing user experience and modernising the platform. As OCaml continues gaining traction in various industries, it is essential to modernise its online presence to meet users' expectations worldwide.
The current styles and colors for light mode already exist so implementing a dark mode will involve adding contrasting colors and styles according to the Figma design. It will also consider accessibility standards and create a button that toggles between the light and dark mode.
Improve the GUI Experience for OCaml Users
Mentee
PrincessIddy
Mentor(s)
Guillaume Petiot
Moazzam Moriani
Inspired by Rust's "Are we GUI yet?", we want the same work done on the OCaml GUI libraries. A similar work has been done for the OCaml web libraries: "Is OCaml web yet?" (see the pull request). This work would allow to tackle "Are we game yet?" in the future.
The survey must take into account the targeted platforms of these libraries, dependencies, (in)compatibilities, features, last updates, etc. A list is available in OCamlVerse but is not complete or detailed enough. Interns having previous knowledge of GUI libraries available in other languages can also compare them to the equivalent OCaml libraries.
This work must result in a guide on OCaml.org, similar to the "Is OCaml web yet?" page.
Summer 2023
Persistent Storage in MirageOS Unikernels
Mentee
PizieDust
Mentor(s)
Reynir Björnsson
Every operating system, even unikernels, need a way to persist data accross reboots. Having persistent storage capabilities in MirageOS is definitely a feature to consider including. Developing this includes building libraries for partitioning disks, filesystems for these partitions, and a simple, intuitive, and programmatic way to interact with these storage devices from a user's view point. This project pushes this vision one step further by building a library for GPT partitioning.
Improving Error Reporting in Existing PPXLIB-Based PPXs
Mentee
Abongwa Bonalais
Mentor(s)
Paul-Elliot Anglès d’Auriac
In the past, when 'ppxlib' encountered an exception in a transformation, it stopped the rewriting process, causing the rewriters after to not be processed. Also multiple errors could not be reported at the same time there were multiple failing rewriters as just the first raising rewriter will raise, and the compilation process stops there. But now, a raising rewriter does not stop the preceding rewriter from running, allowing multiple to be raised both in the context-free phase and all the other phases.
MIDI Over Ethernet With MirageOS
Mentee
Aryan Godara
Mentor(s)
Claes (rand)
Sonja Heinze
Moazzam Moriani
MIDI, which stands for Musical Instrument Digital Interface, is a widely used protocol in the world of music and audio technology. MirageOS is a library operating system that specialises in creating lightweight, secure, and efficient unikernels. Unikernels are highly-specialised, single-purpose virtual machine images designed for specific applications, and it is written in OCaml. The project focussed on implementing the rtpMIDI protocol for serialising-deserialising of MIDI messages over Ethernet and implementing use cases like a publisher-subcriber based server-client model for MIDI messages.
Winter 2022
Implement a Non-Blocking, Streaming Codec for TopoJSON
Mentee
Prisca Chidimma Maduka
Mentor(s)
Patrick Ferris
Odinaka Joy
TopoJSON is an extension to GeoJSON to encode topology. This allows for redundant data to be removed and file sizes to be greatly reduced. This is often very desirable especially when working with data in the browser. In a previous Outreachy internship, a new OCaml library was implemented to provide an OCaml library for TopoJSON. This project will build on this adding more functionality to the library and providing a non-blocking, streaming codec version similar to the geojsone library.
Summer 2022
Expand OCaml 5.0 Parallel Benchmark Suite
Mentee
Moazzam Moriani
Mentor(s)
Sudha Parimala
OCaml 5.0 will be live soon! It ships with support for shared-memory parallelism and concurrency OCaml has missed all these years. This will be accompanied by a robust set of Multicore libraries useful for parallel programming. The Multicore compiler and libraries are under active development and will continue to evolve as the OCaml ecosystem moves towards Multicore. For assessing the impact of new features in the OCaml compiler and Multicore libraries, we have a set of sequential and parallel benchmarks present in our benchmark suite. While the sequential benchmarks contain many real-world applications, a wider set of parallel benchmarks would be useful. This project entails gathering the parallel benchmarks available at various places like https://github.com/ckoparkar/ocaml-benchmarks and making them available in the benchmark suite.
Extend OCaml's GeoJSON Library to Support TopoJSON
Mentee
Jay Dev Jha
Mentor(s)
Patrick Ferris
TopoJSON is an extension to GeoJSON to encode topology. This allows for redundant data to be removed and file sizes to be greatly reduced. This is often very desirable especially when working with data in the browser. This project looks to extend ocaml-geojson
to support TopoJSON.
Winter 2021
Build a Monitoring Dashboard for OCaml.org
Mentee
Jiae Kam
Mentor(s)
Thibaut Mattio
Patrik Keller
We currently have no visibility on the performance of the server serving v3.ocaml.org, which pages are most visited, if errors happen, etc. To offer some visibility, we can implement a basic monitoring dashboard that would provide Metrics such as: Memory, CPU, Open file descriptors, Statistics such as (check if GDPR compliant first!) Requested URIs, User agents, Language, Logs. This project consists of mostly two parts: a frontend and a backend. The backend consists of building a high-level library to collect data and get statistics on them. The frontend will use this library to display graphs of the metrics, statistics, and other data we want to collect.
Improve the OCaml Meta-Programming Ecosystem
Mentee
Aya Sharaf
Mentor(s)
Shon Feder
Sonja Heinze
Patrik Keller
It's common for programming languages to provide some way to meta-program in order to preprocess code before reaching the last compilation step, for example, in the form of macros or templates. The OCaml compiler doesn't provide a full built-in macro system, but the OCaml parser does provide syntax for preprocessing purposes: attributes and extension points. We -the OCaml community- also have an official framework, called ppxlib
, to write preprocessors -called PPXs- based on that syntax and integrate them into the compilation process.
However, it's on the OCaml community to write and provide important PPXs to the OCaml developers. We've noticed that having the most important PPXs under the official PPX GitHub organisation -next to ppxlib
- is helpful. Developers can easily find them; developers can trust them; and they're well-written and hygienic, so developers can use them as how-to guides for writing other PPXs. In this project, you'll write one or some of those official standard PPXs.
Support `.eml` Files in OCaml's VSCode Extension
Mentee
Sayo Bamigbade
Mentor(s)
Thibaut Mattio
Gargi Sharma
Patrik Keller
Support .eml
files in OCaml's VSCode extension Dream, the OCaml web framework, uses .eml
files to embed HTML in OCaml files. At the moment, opening these files in VSCode, with the official OCaml VSCode extension, will not provide any syntax highlighting or diagnostics for the .eml
files, because they are not supported. The goal of the project is to add support for the syntax in the extension itself as a first step, and eventually, add support for the language in the OCaml Language Server (LSP) as a second step.
Summer 2021
Create opam Package Search
Mentee
Odinaka Joy
Mentor(s)
Sonja Heinze
Patrick Ferris
Opam is the source-based package manager for OCaml code. This project comprises of writing a new web client for rendering output from the opam package database. There is a JSON endpoint on opam.ocaml.org, which provides information about packages that would provide metadata about the packages. We can extend this JSON metadata to include all the opam packages (not just the top 10) and use that to power a search frontend for the website. This may include presenting the data as a GraphQL endpoint with the frontend querying that endpoint using GraphQL.
Improve the OCaml.org Website
Mentee
Diksha Gupta
Mentor(s)
Isabella Leandersson
Patrick Ferris
Gargi Sharma
OCaml.org is the main website for OCaml, a functional, typed, high-level programming language. This project revolves around improving the website on multiple different fronts including: layout, accessibility, and content.
Improve the OCaml.org Website
Mentee
Shreya kumari Gupta
Mentor(s)
Isabella Leandersson
Anil Madhavapeddy
Patrick Ferris
Gargi Sharma
OCaml.org is the main website for OCaml, a functional, typed, high-level programming language. This project revolves around improving the website on multiple different fronts including: layout, accessibility, and content.
Summer 2020
Reducing Global Mutable State in the OCaml Compiler Codebase
Mentee
Anukriti Kumar
Mentor(s)
Guillaume Bury
Vincent Laviron
Structured Output Format for the OCaml Compiler Messages
Mentee
Muskan Garg
Mentor(s)
Florian Angeletti
Usually, the output messages from the compiler are a bit more difficult to read for a machine, hence it's more time consuming to find the warnings, errors, etc., and their origin. By producing a structured output for compiler messages, other tools can more easily interoperate with them and provide tooling on top of the messages.
Summer 2019
Test the OCaml Compiler With Code Coverage Tools
Mentee
Oxana Kostikova
Mentor(s)
Sébastien Hinderer
Florian Angeletti
Improving the compiler testing process using code coverage tools. The core OCaml system has a large test suite, and it would be very useful to see which parts of the system are tested more actively and which are not so. Developers will be helped to see where it is needed to add new tests, and in the process of improving coverage, it is possible to find unexplored bugs and fix them. It might help to make OCaml and its libraries more reliable.
Test the OCaml Compiler With Random Tests and a Reference Interpreter
Mentee
Ulugbek Abdullaev
Mentor(s)
Gabriel Scherer
Jan Midtgaard
The aim of this project is to extend an existing testcase-generator for the OCaml compiler, using a reference interpreter (existing or newly developed) to find a lot of bugs in the compiler and fix as much of them as possible.
Summer 2016
MirageOS
Mentee
Gina Marie Maini
Mentor(s)
Mindy Preston
Winter 2015
NTP Support for MirageOS
Mentee
Kia
Mentor(s)
Hannes Mehnert
Summer 2014
MirageOS Contributions and Improvements
Mentee
Mindy Preston
Mentor(s)
Richard Mortier
Anil Madhavapeddy
MirageOS Cloud API Support
Mentee
Jyotsna Prakash
Mentor(s)
David Scott
Anil Madhavapeddy
MirageOS (see http://xenproject.org/developers/teams/mirage-os.html, http://www.openmirage.org/) is a type-safe unikernel written in OCaml. It generates highly specialised "appliance" VMs that run directly on Xen without requiring an intervening kernel. A MirageOS application typically runs via several communicating kernel instances on the cloud. Today these instances are difficult to manage; we would like to explore strategies for managing these distributed computations using common public cloud APIs, such as those exposed by Amazon EC2 and Rackspace. First we need to create pure OCaml API bindings for (e.g.) EC2 and Rackspace (purity is needed to ensure portability). These API bindings can then be used to provide operating-system-level abstractions to the unikernels. For example, a traditional VM might hotplug a vCPU; while a MirageOS application would request a "VM create" using the cloud API and "connect" the new instance to the existing network. We should be able to spin up 1000s of "CPUs" by using such APIs in a cluster environment. As well as helping Xen/Mirage, the public cloud API bindings will be very useful to other people in other contexts -- a nice side-effect.
Outreachy
Upcoming OCaml Outreachy Internships
These internships are specifically designed for people subject to systemic bias and impacted by underrepresentation in the technical industry where they are living. Learn more about opportunities and eligibility at Outreachy.