-
When the Sensor Starts Thinking: SnortML, Agentic AI, and the Evolving Architecture of Intrusion Detection
Signature-based detection has always known what it was looking for. Machine learning and autonomous agents are changing the question entirely, shifting from “does this match a known pattern?” to “does this actually make sense in context?”
-
How researchers are using GitHub Innovation Graph data to reveal the “digital complexity” of nations
One of our goals for the GitHub Innovation Graph was to facilitate research on the economic impact of open source software and developer collaboration. In a paper recently published by Research Policy, four researchers used Innovation Graph data to do just that. I’m happy to share an interview with these researchers, along with our Q4 2025 data release.
The Research Policy paper examines whether the geography of open-source software production on GitHub can reveal the “digital complexity” of nations, and whether that complexity predicts GDP, inequality, and emissions in ways that traditional economic data misses.
Meet the four researchers:
- Sándor Juhász is a research fellow at the Corvinus University of Budapest. His work focuses on economic geography, knowledge networks, and how spatial structures shape innovation.
- Johannes Wachs is an Associate Professor at Corvinus University of Budapest, Director of the Center for Collective Learning at the Corvinus Institute of Advanced Study, and a researcher at the Complexity Science Hub in Vienna. His work sits at the intersection of computational social science and economic geography, with a particular focus on open-source software communities.
- Jermain Kaminski is an Assistant Professor at the School of Business and Economics at Maastricht University. His research specializes in entrepreneurship, strategy, and causal machine learning, with a focus on how data-driven methods can improve decision-making and innovation. He is a cofounder of the Causal Data Science Meeting.
- César A. Hidalgo is a professor at the Toulouse School of Economics and Corvinus University of Budapest, and he is the Director of the Center for Collective Learning. He is also the creator of the Observatory of Economic Complexity and cofounder of DataWheel.
Research Q&A
Kevin: Thanks so much for chatting, everyone! Could you give a quick high-level summary of the paper for our readers here?
Sándor: For the last fifteen years or so, economists have been measuring the complexity of national economies by looking at what physical products countries export, what patents they file, and what research they publish. These measures turn out to be remarkably good at predicting which countries will grow, which have high inequality, amongst many other macroeconomic features. But they all have a massive blind spot: software.
Jermain: Code doesn’t go through customs. It crosses borders through “git push”, cloud services, and package managers. So all that productive knowledge was essentially invisible, what some colleagues have called the “digital dark matter” of the economy. We decided to fix that using the GitHub Innovation Graph, which tracks how many developers in each economy push code in each programming language, based on IP addresses. We applied the Economic Complexity Index (ECI) to this data. The bottom line is that software ECI surfaces new information that trade flows, patents, and research data partly leave on the table. In particular, software ECI helps explain variation in GDP per capita and income inequality even after you control for all the traditional measures.
Johannes: We also found that countries don’t jump randomly between software specializations. They diversify into technology stacks that are related to what they already do, just like countries in the physical economy tend to move into products similar to what they already export. This is considered the “principle of relatedness,” and it holds for software too.
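To make the principle of relatedness concrete, here is a minimal sketch of one common way it is measured in the economic complexity literature. It assumes a binary economy-by-bundle specialization matrix and uses the product-space convention of proximity as the smaller of the two conditional co-specialization probabilities. The function names `proximity` and `relatedness_density` and the toy matrix are illustrative assumptions, and the paper's exact formulation may differ.

```python
def proximity(m):
    """phi[a][b]: share of economies specialized in both bundles,
    relative to the more ubiquitous of the two (assumed convention)."""
    n_bundles = len(m[0])
    ubiquity = [sum(row[b] for row in m) for b in range(n_bundles)]
    phi = [[0.0] * n_bundles for _ in range(n_bundles)]
    for a in range(n_bundles):
        for b in range(n_bundles):
            if a != b and ubiquity[a] and ubiquity[b]:
                both = sum(row[a] * row[b] for row in m)
                phi[a][b] = both / max(ubiquity[a], ubiquity[b])
    return phi


def relatedness_density(m, phi, country, bundle):
    """Fraction of the proximity around `bundle` that is already covered
    by the economy's existing specializations."""
    others = [a for a in range(len(phi)) if a != bundle]
    total = sum(phi[bundle][a] for a in others)
    if total == 0:
        return 0.0
    return sum(m[country][a] * phi[bundle][a] for a in others) / total


# Hypothetical specialization matrix: rows are economies, columns are bundles.
m = [
    [1, 1, 0],
    [1, 1, 0],
    [0, 1, 1],
    [0, 0, 1],
]
phi = proximity(m)
# The last economy is only specialized in bundle 2. Bundle 1 is closer to
# its existing specialization than bundle 0 is.
print(relatedness_density(m, phi, country=3, bundle=1))
print(relatedness_density(m, phi, country=3, bundle=0))
```

Under this sketch, an economy would be expected to enter bundles with higher density more often than bundles with lower density, which is the kind of pattern a relatedness test looks for.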
Kevin: Interesting! Could you provide an overview of the methods you used in your analysis?
Johannes: Sure. As mentioned, the core data comes from the GitHub Innovation Graph, which gives us quarterly counts of developers pushing code by economy and programming language for 163 economies and 150 languages from 2020 to 2023. But individual programming languages aren’t really the right unit, because most real software uses bundles of languages together. A web app might combine HTML, CSS, and JavaScript; a data science project uses Python and Jupyter Notebook; systems programming pairs C with Assembly.
Sándor: So we built a separate dataset by querying the GitHub GraphQL API for all repositories active in 2024 to find which languages co-occur within the same repos. We computed cosine similarity between languages based on weighted co-occurrence, with a normalization scheme so that polyglot repos with twenty languages don’t dominate the signal, and then applied hierarchical clustering to group the 150 languages into 59 “software bundles.” Each bundle represents a coherent technology stack.
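The bundling step can be illustrated with a small sketch. It is not the authors' pipeline: the repository list is made up, the per-repository weight of 1 / (number of languages) is an assumed stand-in for the normalization scheme mentioned above, and average linkage is an assumed choice of hierarchical clustering rule. The function names are hypothetical.

```python
from itertools import combinations
from math import sqrt


def language_vectors(repos):
    """Map each language to a sparse vector {repo_index: weight}."""
    vectors = {}
    for r, langs in enumerate(repos):
        langs = set(langs)
        if not langs:
            continue
        weight = 1.0 / len(langs)  # assumed: damp repos with many languages
        for lang in langs:
            vectors.setdefault(lang, {})[r] = weight
    return vectors


def cosine(u, v):
    dot = sum(w * v[r] for r, w in u.items() if r in v)
    norm_u = sqrt(sum(w * w for w in u.values()))
    norm_v = sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v)


def cluster_languages(repos, n_bundles):
    """Agglomerative clustering with average linkage on cosine similarity."""
    vectors = language_vectors(repos)
    names = sorted(vectors)
    sim = {
        frozenset((a, b)): cosine(vectors[a], vectors[b])
        for a, b in combinations(names, 2)
    }

    def average_link(c1, c2):
        return sum(sim[frozenset((a, b))] for a in c1 for b in c2) / (len(c1) * len(c2))

    clusters = [[name] for name in names]
    while len(clusters) > n_bundles:
        # Merge the two clusters with the highest average similarity.
        i, j = max(
            combinations(range(len(clusters)), 2),
            key=lambda p: average_link(clusters[p[0]], clusters[p[1]]),
        )
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return sorted(sorted(c) for c in clusters)


# Hypothetical repositories, each listed by the languages it contains.
repos = [
    ["HTML", "CSS", "JavaScript"],
    ["HTML", "CSS", "JavaScript"],
    ["JavaScript", "CSS"],
    ["Python", "Jupyter Notebook"],
    ["Python", "Jupyter Notebook"],
    ["C", "Assembly"],
    ["Python", "C"],
]
bundles = cluster_languages(repos, n_bundles=3)
for bundle in bundles:
    print(bundle)
```

On this toy input the three bundles correspond to the web, data science, and systems stacks from the examples above. A real run would cut the hierarchy at a similarity threshold or a target number of bundles chosen by the analyst.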
Jermain: …and from there, it’s the “standard” economic complexity pipeline. We build a country-by-bundle matrix, compute revealed comparative advantage, essentially asking, “does this country have a disproportionate share of developers in this bundle relative to the global average?”, binarize it, and then apply the iterative method to compute the Economic Complexity Index. Countries that specialize in many non-ubiquitous bundles score high, and countries that only specialize in things everyone does score low. For the relatedness analysis, we define proximity between bundles using co-specialization patterns. If countries that are good at bundle A also tend to be good at bundle B, those bundles are close in the software space. Then we test whether countries are more likely to enter bundles that are close to their existing specializations.
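The first half of that pipeline can be sketched in a few lines. This is an illustration under stated assumptions rather than the paper's code: the developer counts are invented, the RCA threshold of 1 is the conventional cutoff, and the iterative step is written as a standardized version of the method of reflections. All function names are hypothetical.

```python
from statistics import mean, pstdev


def revealed_comparative_advantage(counts):
    """RCA[c][b]: economy c's share of developers in bundle b,
    relative to bundle b's share of all developers."""
    country_totals = [sum(row) for row in counts]
    bundle_totals = [sum(col) for col in zip(*counts)]
    grand_total = sum(country_totals)
    return [
        [(x / country_totals[c]) / (bundle_totals[b] / grand_total)
         for b, x in enumerate(row)]
        for c, row in enumerate(counts)
    ]


def binarize(rca, threshold=1.0):
    """1 where the economy is specialized (RCA >= threshold), else 0."""
    return [[1 if value >= threshold else 0 for value in row] for row in rca]


def standardize(values):
    mu, sigma = mean(values), pstdev(values)
    return [(v - mu) / sigma for v in values]


def economic_complexity_index(m, iterations=20):
    """Alternate averages between economies and bundles, z-scoring each round."""
    diversity = [sum(row) for row in m]        # bundles per economy
    ubiquity = [sum(col) for col in zip(*m)]   # economies per bundle
    k_country = standardize(diversity)
    for _ in range(iterations):
        # Average score of the economies specialized in each bundle.
        k_bundle = [
            sum(m[c][b] * k_country[c] for c in range(len(m))) / ubiquity[b]
            for b in range(len(ubiquity))
        ]
        # Average score of the bundles each economy is specialized in.
        k_country = standardize([
            sum(m[c][b] * k_bundle[b] for b in range(len(ubiquity))) / diversity[c]
            for c in range(len(m))
        ])
    return k_country


# Hypothetical developer counts: rows are economies A, B, C; columns are
# a ubiquitous bundle, a mid-range bundle, and a rare bundle.
counts = [
    [10, 10, 10],
    [24, 6, 0],
    [30, 0, 0],
]
m = binarize(revealed_comparative_advantage(counts))
eci = economic_complexity_index(m)
for name, score in zip(["A", "B", "C"], eci):
    print(name, round(score, 3))
```

In this toy case, economy A, which is specialized in the rare bundle, ranks highest, and economy C, which is specialized only in the ubiquitous bundle, ranks lowest. The sketch assumes every economy and every bundle has at least one specialization and that the scores are not all equal, otherwise the divisions fail.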
Kevin: Nice! Follow-up question: could you provide an “explain it like I’m five” overview of the methods you used in your analysis?
César: Think of countries like kitchens. Some kitchens can cook anything, since they have an abundance of ingredients and tools, from the rarest spices to the best knives. Others are more limited. Maybe they can boil rice and do a few other simple things. Since we cannot look at the kitchens directly, we need to infer their “complexity” based on the dishes they are able to produce. This is what the economic complexity index or ECI allows you to estimate. We can infer what’s going on in the kitchen by seeing if it is a chicken and rice operation, or a place that can produce sophisticated edible foams and souffles. Originally, these methods were applied to trade data, where the dishes coming out of the kitchen were a country’s exports, but in this paper, we applied that to software. A chicken-and-rice country is a Python and JavaScript country. A Michelin-star country is one that can program certified embedded systems for aerospace and defense.
Top 20 economies by software economic complexity
| Ranking | Economy | Software ECI |
| --- | --- | --- |
| 1 | Germany | 1.739 |
| 2 | Australia | 1.730 |
| 3 | Canada | 1.729 |
| 4 | Netherlands | 1.727 |
| 5 | France | 1.702 |
| 6 | United States | 1.695 |
| 7 | Poland | 1.691 |
| 8 | United Kingdom | 1.687 |
| 9 | Italy | 1.672 |
| 10 | Sweden | 1.620 |
| 11 | Switzerland | 1.620 |
| 12 | Hong Kong SAR | 1.595 |
| 13 | Norway | 1.571 |
| 14 | Japan | 1.552 |
| 15 | Spain | 1.552 |
| 16 | Russia | 1.530 |
| 17 | Singapore | 1.468 |
| 18 | Taiwan | 1.464 |
| 19 | Belgium | 1.448 |
| 20 | Finland | 1.444 |

Kevin: Thanks, that’s super helpful. I’d be curious about the limitations of your paper and data that you wished you had for further work. What would the ideal datasets look like for you?
Johannes: One major drawback is that we only see public GitHub activity. That means we’re missing proprietary software entirely. Hence, we can’t see closed-source enterprise work, which is huge. So our measure likely underestimates software complexity in countries with a weaker open source software culture.
Sándor: The time window is another constraint. Four years of data (2020–2023) is enough for cross-sectional analysis but too short to credibly test long-run growth predictions, which is what economic complexity measures are really designed for. Economic structures shift over decades, not quarters. We’d love to have twenty years of this data.
Jermain: The dream dataset would combine GitHub-like activity data with information about the projects themselves, not just languages, but frameworks, libraries, and what the software actually does. Considering this dimension would be a natural next step for our project, and it would shed more light on software bundles and use cases. If we knew that a repo was building a fintech application versus a game engine, we could define much finer-grained capability bundles. GitHub Topics gives us a taste of this, and we used it as a robustness check, but it’s still noisy and incomplete.
Kevin: Do you have any predictions for the future? Recommendations for policymakers? Recommendations for developers?
César: Software is an interesting target for industrial policy because it is an industry that depends primarily on highly movable human capital (software developers). In principle, it provides an opportunity for development that can be incentivized via talent attraction programs. In practice, however, the high mobility of software talent can be a double-edged sword, since that makes it sensitive to consumer protection regulations that make it hard to work with data or worker protection schemes that distribute the risk of innovation to small and medium-sized firms (e.g., laws that on paper protect workers, but that in reality pass on that responsibility to the firms). The countries that figure out how to attract software talent without suffocating it with well-intentioned but poorly designed regulation will pull ahead.
Johannes: For developers, understanding that places are highly specialized in the kind of software they produce is useful when they are looking to relocate. Developers can use the product space representation of software capabilities to know which countries their skillsets are a good match for.
Jermain: Looking ahead, the big question is what generative AI does to this picture. If AI coding assistants lower the barrier to working in new programming languages, does relatedness weaken? Do countries diversify faster? Or does it reinforce existing advantages because the countries with the best AI infrastructure benefit most? We’re working on this, and Johannes and his colleagues have a new paper in Science on tracking the global diffusion of AI-assisted coding on GitHub. I think the answer will reshape how we think about digital complexity within the next five years. One further consideration would be how classifications of software or software bundles would be represented as NAICS or NACE industry codes.
Sándor: I’d add a prediction: I think we’ll see economic complexity indices based on software data become a standard part of the policymaker’s toolkit within the decade, sitting right alongside the trade-based measures. The data is open, it updates quarterly, and it captures something that traditional data genuinely can’t.
Personal Q&A
Kevin: I’d like to change gears a bit to chat more about your personal stories. Johannes, I understand that you have a background in computational social science and network science, which is a bit different from the traditional economics path. Tell us more about your path to research.
Johannes: I actually started in mathematics and then moved into computational social science during my PhD at Central European University in Budapest. I became enchanted by the opportunities that digital data traces present for studying human behavior. I like using network methods because they help us move between the micro level activity and interactions found in such traces and the macro outcomes. I stumbled into open source research in particular when I realized that GitHub data was this incredibly rich, publicly available record of valuable knowledge production that few people were using to study social science questions.
Kevin: Sándor, I see you have a background in economic geography, which is a more traditional route compared to computational social science. What was your path toward working with software data?
Sándor: I received my PhD in economic geography at Utrecht University, in a research community that was already using economic complexity to study regional development. So I was trained in thinking about places (cities, regions, industries) through the lens of networks and capability accumulation.
Kevin: Jermain, it looks like you developed practical technical expertise through some entrepreneurial projects in parallel with academic training.
Jermain: During my PhD at RWTH Aachen, I was a visiting researcher with César at MIT. In that time, I was also working with a colleague on a project called Moviegalaxies.com (open data) and later worked on analyzing text, speech and video data in Kickstarter projects. It was my first multimodal machine learning pipeline. From my network analysis projects, I somehow ended up analyzing passing networks for a larger German soccer team. These days my research is mostly concerned with causality and causal machine learning. In this capacity, I co-founded the Causal Data Science Meeting with my colleague Paul Hünermund.
Kevin: César, do I have it right that you have a background in physics?
César: I started in physics, with a PhD at Notre Dame focused on complex networks. During that time, I realized that network tools could be used to describe the evolution and fate of economies. Eventually, this became a field that we know today as economic complexity, which studies the process of economic development by using tools from physics, economics, and computer science.
Kevin: Finding a niche that you’re passionate about is such a joy, and I’m curious about how you’ve found living in that niche. What’s the day-to-day like for you?
Johannes: Honestly, in research, the day-to-day is a mix of writing code, writing papers, and talking to people, then iterating. Of course, working at a university usually comes with teaching and administration, too. I like that I have a good amount of freedom in what I choose to work on. If a project or direction doesn’t spark joy, I can usually shift my focus. That is a unique thing.
Sándor: I’d add that one of the best parts of this niche is the interdisciplinary community. On any given week I might talk to an economic geographer, a computer scientist, and a physicist about the same research question. That’s unusual and very stimulating.
Kevin: Have things changed since generative AI tooling came along? Have you found generative AI tools to be helpful?
Johannes: Absolutely. We use LLM tools regularly now for things like debugging data pipelines, drafting boilerplate code, and even sanity-checking statistical approaches. It’s particularly useful in a project like this one, where you have a lot of different methods and need to coordinate work in a team. That said, LLMs are much more helpful if you already have a clear idea in mind.
Kevin: Do you have any advice for folks who are starting out in software engineering or research? What tips might you give to a younger version of yourself, say, from 10 years ago?
César: The key is to invest in things that grow or compound. This is easier said than done because there are always distractions and temptations. I’ve seen many scholars spend months or years working on projects just because they don’t want to lose the work that they’ve already put into them. The cost of doing that is working on other projects that might matter more in ten or twenty years. Building tools that can generate an audience, like The Observatory of Economic Complexity, Data USA, or Pantheon, was challenging, but they have borne fruit for a long time. The same is true about working on a few important papers or completing a book. The question you need to ask when working on a project is whether you honestly believe that the project will be more important in a decade from now than today. If the answer is yes, that’s probably a good project. Ten years ago, I would have told myself to trust that test more and to walk away from “almost done” projects faster. Sunk costs are the most expensive thing in a research career.
Johannes: I can only really make suggestions for young researchers. The first is to build a broad question and research agenda to motivate what you do. You have to have a problem you care about so much that even partial or highly specific results about that problem get you excited. Once you have that, in practice I think there is a lot of value in generating your own data. I prefer applying a straightforward method to a bespoke dataset over applying a highly complex method to a dataset everyone knows.
Jermain: My advice echoes César’s: don’t ride a dead horse. In the years after the PhD and into assistant professorship, it’s tempting to keep milking old topics while pivoting to new ones, but this leaves you straddling two worlds and mastering neither. Pick your focus deliberately, narrow enough to build real expertise, broad enough to stay curious, and be willing to let go of past work that no longer aligns, even if it feels wasteful.
Sándor: I’d tell my younger self to collaborate more and earlier. This paper has four authors across five institutions in four countries. That wouldn’t have happened if any of us had stayed in our silos. Go to conferences outside your field, say yes to coffee meetings with people whose work seems tangentially related, and don’t be afraid to cold-email researchers whose work you admire.
Kevin: Are there any learning resources you might recommend to someone interested in learning more about this space?
César: The Observatory of Economic Complexity, for a web experience, and The Infinite Alphabet and the Laws of Knowledge, for a book that puts this in context.
Jermain: If you’re a developer curious about the economics angle, I’d honestly just recommend browsing the Observatory of Economic Complexity and looking up your own country. See what it exports, where it sits in the product space, and then think about how software fits in. It’s a very accessible way to build intuition before diving into the math.
Kevin: Thank you, Sándor, Johannes, Jermain, and César! It’s been fascinating to learn about your current work and broader career trajectories. We truly appreciate you taking the time to speak with us and will absolutely keep following your work.
The post How researchers are using GitHub Innovation Graph data to reveal the “digital complexity” of nations appeared first on The GitHub Blog.
-
Why age assurance laws matter for developers
Policymakers around the world are advancing age assurance proposals to protect children and teens online. Some approaches restrict minors’ access to certain services or content, while others would require devices, operating systems, or app stores to collect age information and pass age signals to apps and websites. These proposals are driven by serious concerns, but without appropriate scoping, they risk imposing burdensome requirements on open source software and developer infrastructure services that do not present the same risks to minors as consumer-facing platforms. In this blog post, we’ll provide an overview of what developers should know and how to engage.
The harms these laws aim to address are serious and deserve attention. Grooming for sexual purposes, exposure to violent content, and online bullying are just some of the risks young people are facing online. At the same time, participation in online communities, including open source software development, can be an important part of a young person’s education and social life. When trying to strike a balance between freedom and protection, policymakers are not always aware of how their proposals could affect developers or how the open source ecosystem operates.
“Age assurance” refers to a range of approaches used to determine or estimate a user’s age. It is sometimes used interchangeably with “age verification,” which typically refers to higher-confidence methods like photo ID matching or checks against financial or identity systems. Age assurance also includes self-attestation (where users report their age) and age estimation (where age is inferred from signals, facial scanning, or behavior). These approaches span a wide spectrum, with ongoing debate about tradeoffs between accuracy, privacy, security, interoperability, and accessibility. Proposals also vary in what age thresholds trigger restrictions, the services or content covered, how parental consent should factor in, and how access is limited. While we do not discuss each approach in detail here, we encourage readers to engage with the legislation, consider different technical and policy perspectives, and think about how to protect young people online while preserving access to the knowledge, learning opportunities, and creative potential the internet enables—including opportunities to learn to code and participate in the global open source ecosystem.
A poorly designed age assurance law could have significant unintended impacts for open source projects. For example, requirements that operating systems centrally collect and manage user data, or that restrict users from installing software outside of centralized app stores, would conflict with the decentralized, user-controlled norms of the open source ecosystem.
Another potential pitfall is placing age assurance requirements on “publishers” of operating systems, regardless of whether they are individuals or companies. Open source operating systems are frequently iterated on, reused, and redistributed by individual contributors and small communities, many of which have limited resources and small user bases. The diversity of the software ecosystem is worth preserving.
GitHub has engaged with governments on age‑related online safety proposals for several years. In some cases, including Australia’s Social Media Minimum Age legislation, we worked with policymakers to explain why open source code collaboration platforms should not be in scope. Similar exemptions appear elsewhere. France’s current Social Media Minimum Age proposal, for example, includes the same exclusions for open source code collaboration sites and online encyclopedias that appear in the EU Copyright Directive.
Many policymakers recognize that access to the open source software development ecosystem delivers significant public benefits, including education, innovation, and security, and that the risks young people face from participating in open source development communities are materially different by comparison. At the same time, a growing number of laws are seeking to advance child safety goals at varying levels of the tech stack, including through operating systems and application distribution layers. This has raised new questions for developers about how these requirements apply in practice, and whether open source operating systems and developer infrastructure like GitHub could be impacted.
Legislation to know
- California AB 1043 Digital Age Assurance Act and 2026 amending bill AB 1856: Requires operating system providers (in coordination with covered app stores) to collect self‑declared age at account setup and transmit an age‑range signal to applications via a real‑time API.
- Colorado SB 26-051 Age Attestation on Computing Devices: Requires operating systems and covered app stores to generate and share an age‑bracket signal with applications via a real‑time interface, with evolving definitions of “covered application” and “covered application store” shaping scope.
- Illinois HB 4140 Digital Age Assurance Act: Applies to operating system providers and requires collection of age data and transmission of an age‑category signal to developers via a real‑time API, closely mirroring California’s model.
- New York S 8102/A 8893 Device‑Level Age Assurance: Applies broadly to device manufacturers, operating systems, and app stores, requiring “commercially reasonable” age assurance (not just self‑reporting) at device activation and transmission of a verified age signal to apps and websites.
This is just a selection of operating system and app store age assurance legislation in the United States. There have also been related but distinct laws focused on app stores passed in Texas (SB 2420), Louisiana (HB 570), and Utah (SB 142).
These proposals are actively evolving. In Colorado, SB 26‑051 recently had a committee hearing on April 23, 2026, as part of ongoing legislative consideration. The hearing reflected the complexity of balancing child safety, privacy, and feasibility, and included strong engagement from open source developers and organizations. Committee members also signaled that the intent is not to bring open source operating systems or developer infrastructure into scope, and the latest amended text clarified that software installed outside of app stores, including software downloaded from public repositories, is not in scope.
In Brazil, the Digital Statute for Children and Adolescents (Digital ECA), enforceable as of March 2026, applies broadly to digital services “likely to be accessed by children and adolescents” including operating systems, app stores, and platforms, and excludes essential internet functionalities such as open technical protocols and standards. Although the Brazilian National Data Protection Agency (ANPD) has not yet formally clarified whether or how the law applies to free and open source software, its regulatory agenda has initially prioritized “app stores and proprietary operating systems,” and recent draft guidance under public consultation indicates that systems based on collaborative models and free software should not be subject to the same obligations as proprietary services.
Despite this, legal uncertainty has already driven some open source projects to restrict access in Brazil, reflecting concerns about the feasibility of compliance for non-commercial, volunteer-driven ecosystems. While the law was primarily designed for commercial actors, key questions about scope remain unresolved, making it critical for open source developers to actively participate in the ongoing public consultation to ensure implementation reflects decentralized development models and avoids unintended restrictions on access and innovation.
What is an app store, really?
While much of the open source community’s concern has focused on the risk that these proposals could present to open source operating systems, an equally important open question is how “application store” and “application” are defined. As drafted, some definitions of “application store” are broad enough to capture developer infrastructure—such as code collaboration platforms, package managers, and open source indexing services—simply because they allow users to access or download software.
Making software available for download is not the same as operating the kind of centralized, consumer-facing marketplace that most people would understand to be an app store. It is also important to define “application” precisely. Downloading software components like source code, libraries, frameworks, models, and utilities is fundamentally different from offering a standalone executable application through a consumer app marketplace. These components are upstream building blocks, not end‑user products, and the services that host or index them do not control consumer distribution or presentation in the way traditional app stores do.
Recent amendments and discussions across jurisdictions suggest regulators intend to focus on consumer‑facing distribution and services that control end‑user access. Clear statutory distinctions are needed to ensure laws align with that intent. This reflects a familiar challenge in technology policymaking. Frameworks aimed at youth safety and age assurance are typically responding to risks associated with consumer-facing services that collect and monetize user data, distribute content at scale, and rely on engagement-driven systems to shape user behavior.
By contrast, platforms that support collaborative software development serve a fundamentally different function—they are built to help users create, share, and maintain code, not to attract mass audiences, amplify content, or drive passive or excessive consumption—resulting in a materially lower risk profile. Open source communities operating on services like GitHub are organized around shared technical goals and guided by norms of collaboration, reuse, and transparency, further underscoring why these developer-focused services should be distinguished from consumer-facing platforms in regulatory design.
Open source software is also a key driver of economic development and innovation and functions as critical digital infrastructure. Ensuring that policies accurately reflect how open source is built and maintained is essential to preserving these benefits. When policymakers engage directly with open source developers and civil society, they are often able to refine definitions, clarify scope, and better align laws with technical realities.
Uncertainty around compliance requirements can be challenging for open source developers, many of whom contribute on a voluntary basis. At the same time, there are positive examples of policymakers engaging with the open source community to strike a balanced approach. The EU Cyber Resilience Act, for example, was refined through an iterative process to get the balance right for open source. Across U.S. states, these bills continue to evolve and policymakers have shown a willingness to engage with the open source community and consider changes to align with regulatory intent and technical feasibility.
An opportunity for engagement
The window for constructive engagement remains open—and developer voices can make a meaningful difference.
There are concrete ways for developers to share their perspectives: contacting elected representatives in states considering these proposals, like California, Colorado, Illinois, and New York; contributing to Brazil’s Digital ECA public consultation; engaging with organizations like the Open Source Initiative; or working through foundations that steward projects that may be impacted, like the FreeBSD Foundation and Debian. Doing so helps ensure these policies support children’s digital safety while reflecting technical realities, aligning with regulatory intent, and avoiding unintended harm to the open source ecosystem.
GitHub will continue working with policymakers and the open source community to support balanced approaches that protect young people while preserving open development. We encourage developers to stay informed, connect with open source policy organizations, and reach out to us with questions or concerns. We’ll also continue this conversation with a Maintainer Month livestream on May 22 with panelists from the FreeBSD Foundation and the Open Source Initiative to discuss the broader issues raised by these proposals and how technology policy can be designed with open source in mind.
The post Why age assurance laws matter for developers appeared first on The GitHub Blog.
-
GitHub for Beginners: Getting started with OSS contributions
Welcome to GitHub for Beginners. So far, we’ve discussed GitHub Issues and Projects, GitHub Actions, security, GitHub Pages, and Markdown. This time, we’re going to talk about open source software and how to contribute to this community. By the end of this post, you’ll know what open source is, how to find projects to work on, how to read an open source repository, and how to start making your first contributions. So let’s get started!
As always, if you’d prefer to watch the video or want to refer to it, all of our GitHub for Beginners episodes are available on YouTube.
What is open source?
Open source software (OSS) refers to software whose source code is freely available. Unlike “closed source” software, open source software is publicly accessible and can be used and built upon by anyone. This means that all of the work, including the codebase and the communication between users, is open to everyone.
If you’re new to software development, browsing and contributing to open source projects is a great way to immerse yourself in large, high-impact projects used by countless people around the world.
GitHub is the go-to place for open source software, so let’s look at how to find projects you can contribute to.
How to find OSS projects to work on
Contributing to an open source project for the first time can be intimidating. We’ve all been there! The first step is to look for projects that are written in a programming language you know and that accept new contributors. One way to do this is to ask GitHub Copilot Chat for help.
- Navigate to github.com and select the Copilot icon to open a chat window.
- In the bottom-left corner of the chat window, use the dropdown to select Ask.
- Enter a prompt like the following one, but remember to update it for a language you’re comfortable with.
I’m looking for a list of open source projects written in TypeScript that are accepting new contributors. Search GitHub and narrow the list to repositories that use the good first issue label and have more than 100 stars on GitHub.
Copilot will run a few searches and return a list of projects you can explore, filtered by the good first issue label. This label indicates that an issue is beginner-friendly and a good starting point for new contributors, which makes it a great way to find issues in a project that you can work on.
For example, let’s say you want to contribute to the vscode repository.
- Navigate to the vscode repository.
- At the top of the repository, select the Issues tab.
- On the Issues page, click the Labels box to open the dropdown menu.
- In the dropdown’s text box, start typing “good” until the good first issue option appears.
- Select the good first issue label.
The window will update to show a list of good first issues you can work on. But before you dive in, you should read the contributor guide in the project’s repository. Most well-maintained open source projects have one.
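The two searches described above can also be written as plain github.com URLs. The following is a minimal sketch, not something the article itself provides: the function names are hypothetical, and it assumes GitHub’s repository search qualifiers `language:`, `stars:>N`, and `good-first-issues:>N`, plus the `is:issue is:open label:"..."` filter syntax used on a repository’s Issues tab. It only builds the URLs and makes no network requests.

```python
from urllib.parse import urlencode


def repo_search_url(language: str, min_stars: int = 100) -> str:
    """Build a repository search URL for beginner-friendly projects.

    Assumption: the qualifiers language:, stars:>N and good-first-issues:>N
    behave as described in GitHub's repository search documentation.
    """
    query = f"language:{language} stars:>{min_stars} good-first-issues:>0"
    return "https://github.com/search?" + urlencode(
        {"q": query, "type": "repositories"}
    )


def issue_filter_url(owner: str, repo: str, label: str = "good first issue") -> str:
    """Build the URL of a repository's Issues tab, filtered to open issues with a label."""
    query = f'is:issue is:open label:"{label}"'
    return f"https://github.com/{owner}/{repo}/issues?" + urlencode({"q": query})


if __name__ == "__main__":
    # TypeScript and the 100-star threshold come from the example prompt above.
    print(repo_search_url("TypeScript"))
    # The owner "microsoft" is an assumption; the article only names "vscode".
    print(issue_filter_url("microsoft", "vscode"))
```

Opening either URL in a browser should show the same kind of filtered list as the manual steps, provided the qualifiers are still supported in that form.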
Understanding an open source project
As we just mentioned, well-maintained open source projects tend to have a few things in common:
- A well-documented README with installation instructions.
- A contributor guide that explains how to contribute.
- An open source license, so everyone knows the project is free to use and on what terms.
- At least 100 GitHub stars, to show that the community is using it.
- Active development, so you know a maintainer will be around to review your contributions.
- A good first issue label, to indicate that the project is open to new contributors.
These are the things to look for in a repository when you’re choosing a project to contribute to.
💡 For more guidance on finding a good open source project and good first issues, head to gh.io/gfb-oss.
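The checklist above can be written down as a small function, which is handy when comparing several candidate projects. This is only a sketch: `RepoSnapshot` is a hypothetical, hand-filled record rather than a GitHub API type, and the 90-day cutoff for “active development” is an assumption, since the article does not define a number.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class RepoSnapshot:
    """Hypothetical summary of a repository, filled in by hand after browsing it."""
    has_readme: bool
    has_contributing_guide: bool
    has_license: bool
    stars: int
    last_commit: date
    good_first_issue_count: int


def unmet_criteria(repo: RepoSnapshot, today: date, max_idle_days: int = 90) -> list[str]:
    """Return the checklist items the repository fails (empty list = all pass).

    max_idle_days is an assumed threshold for "active development".
    """
    idle_days = (today - repo.last_commit).days
    checks = {
        "well-documented README": repo.has_readme,
        "contributor guide": repo.has_contributing_guide,
        "open source license": repo.has_license,
        "at least 100 stars": repo.stars >= 100,
        "active development": idle_days <= max_idle_days,
        "good first issue label in use": repo.good_first_issue_count > 0,
    }
    return [name for name, passed in checks.items() if not passed]


if __name__ == "__main__":
    # Illustrative values only; they do not describe a real repository.
    candidate = RepoSnapshot(True, True, True, 4200, date(2025, 5, 1), 3)
    print(unmet_criteria(candidate, today=date(2025, 5, 2)))
```

An empty result means every item on the list is satisfied; otherwise the returned names show what to double-check before investing time in the project.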
Making an OSS contribution
Now let’s look at a real project and think through how you would submit your first contribution. For this demo, take a look at the gitfolio repository. Using the checklist above, we want to see whether it would be a good project to work on.
- The project has a well-documented README.
- The project has a contributor guide: CONTRIBUTING.md.
- You can see the open source license: LICENSE.
- It has several thousand stars, well beyond our benchmark of 100.
- At the top of the file list, you can see the most recent commit, which should be fairly recent. At the time of writing, the latest commit was made yesterday, which indicates that the project is actively maintained.
Based on these points, as long as you’re familiar with TypeScript, this is a good repository to contribute to. However, you don’t need to know TypeScript to follow the rest of the demo.
Now you want to create a fork of the repository. A fork is a copy of the repository where you can freely experiment and make changes without affecting the original project. Forks are the usual way to make open source contributions. If you need a refresher on creating a repository, check out this earlier GitHub for Beginners blog.
- Navigate to the project’s main page if you aren’t already there.
- At the top of the project, click the Fork button.
- In the new window, leave yourself as the owner and make sure the “Repository name” is the same as the original repository’s name (that is, “gitfolio”).
- At the bottom of the window, select Create fork.
- In your forked copy of the repository, click README.md in the file list.
- Edit the file by adding some text.
- In the top right, select Commit changes…
- Make sure to select the option at the bottom to Create a new branch for this commit and start a pull request.
- Select Propose changes.
- In the next window, click the Create pull request button. This lets you create a pull request to the main repository from the branch that contains your changes.
- At the top of the “Open a pull request” window, select compare across forks. This shows the changes in your fork compared with the original repository.
- If you are submitting a real change to the repository (not just walking through a demo), this is where you give your pull request a title and description. You should also link to the issue you are addressing in the pull request description.
At this point, you would be ready to submit your pull request by clicking the button at the bottom of the window. However, once you do that, the change no longer lives only in your fork: it becomes a requested update to the original repository. That is why this step is not included in the list above. Once you submit your pull request, it is available for review and, hopefully, approval by a maintainer!
Once the pull request is approved and merged, GitHub automatically applies the changes from your fork to the main branch of the original repository, the official source of truth for the codebase.
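The walkthrough above uses the web editor, but the same edit can be made from a local clone of your fork. The sketch below is an illustration rather than part of the article: it only assembles the standard Git commands as strings and does not run them, it assumes the fork already exists on GitHub, and the username, branch name, and commit message are placeholders.

```python
import shlex


def fork_workflow_commands(username: str, repo: str, branch: str, message: str) -> list[str]:
    """List the Git commands that mirror the web steps: clone, branch, commit, push.

    Assumptions: the fork already exists at github.com/<username>/<repo>,
    and README.md is edited by hand between the switch and add steps.
    """
    fork_url = f"https://github.com/{username}/{repo}.git"
    return [
        f"git clone {fork_url}",
        f"cd {repo}",
        # "git switch -c" needs Git 2.23+; "git checkout -b" works on older versions.
        f"git switch -c {shlex.quote(branch)}",
        "git add README.md",
        # shlex.quote keeps a message with spaces as a single shell argument.
        f"git commit -m {shlex.quote(message)}",
        f"git push -u origin {shlex.quote(branch)}",
    ]


if __name__ == "__main__":
    # "octocat" and "update-readme" are placeholder values.
    for command in fork_workflow_commands("octocat", "gitfolio", "update-readme", "Update README"):
        print(command)
```

After the push, opening the fork on GitHub should offer to start a pull request from the new branch, which brings you back to the “Open a pull request” window described in the steps.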
What’s next?
Congratulations! You’ve learned how to make your own contributions to open source software. I hope this inspires you to contribute to your favorite projects.
And if you’re looking for more information, we have plenty of documentation that can help. Here are a few links to get you started:
- Finding ways to contribute to open source on GitHub
- Contributing to open source
- Contributing to a project through forking
Happy coding!
The post GitHub for Beginners: Getting started with OSS contributions appeared first on The GitHub Blog.