All of Taran's Comments + Replies

Taran

Thinking about it more, there's another, more serious restriction, at least for now: Codex can't write code that depends on the rest of your codebase. Consider the following lightly anonymized code from a real-world codebase I contribute to:

def unobserved(self) -> Set[str]:
    """Returns a set of all unobserved random variable names inside this Distribution -- that is,
    those that are neither observed nor marginalized over.
    """
    return set(self.components) - self.observed - self.marginalized

I don't think C...

Michaël Trazzi
I created a class initializing the attributes you mentioned, and when I added your docstring to your function signature, it gave me exactly the answer you were looking for. Note that this all worked on the first try, and that I did not think at all about the initialization of components, marginalized, or observed; I simply auto-completed.

class Distribution:
    def __init__(self):
        self.components = []
        self.marginalized = None
        self.observed = None

    def unobserved(self) -> Set[str]:
        """Returns a set of all unobserved random variable names inside this Distribution -- that is,
        those that are neither observed nor marginalized over.
        """
        return set(self.components) - set(self.observed) - set(self.marginalized)
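A side note for anyone who wants to run this: with the auto-completed defaults, where `observed` and `marginalized` are `None`, calling `unobserved()` raises a `TypeError`, because `set(None)` is not valid. Below is a minimal runnable sketch that assumes empty collections as defaults instead. The constructor parameters and the example variable names are hypothetical and are not taken from the original codebase, which is not shown in this thread.

```python
from typing import Iterable, Set


class Distribution:
    """Hypothetical stand-in for the real class, which is not shown in the thread."""

    def __init__(
        self,
        components: Iterable[str] = (),
        observed: Iterable[str] = (),
        marginalized: Iterable[str] = (),
    ) -> None:
        # Assumption: all three attributes hold random variable names.
        self.components = list(components)
        self.observed = set(observed)
        self.marginalized = set(marginalized)

    def unobserved(self) -> Set[str]:
        """Return the names that are neither observed nor marginalized over."""
        return set(self.components) - self.observed - self.marginalized


# Made-up variable names, purely for illustration.
dist = Distribution(
    components=["x", "y", "z", "w"],
    observed=["x"],
    marginalized=["w"],
)
print(sorted(dist.unobserved()))  # ['y', 'z']
```

Storing `observed` and `marginalized` as sets up front also means the original one-line body works unchanged, without the extra `set(...)` wrappers.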
Taran

What's your take on the licensing issue?  I know for sure that Codex won't affect my day job in the near term, because I'm not allowed to use it; I guess most large companies, and open-source projects large enough to care about IP assignment, will have the same problem.

This indirectly influences the speed of model improvement we should expect: the market for this kind of tool is smaller than you might think from the size of GitHub's userbase, so there'll be less incentive to invest in it.

Michaël Trazzi
Wait, did they plainly forbid you from using it at all during work time, or did they forbid you from using its outputs for IT issues? Surely using Codex for inspiration, giving it a natural language prompt and looking at which functions it calls, does not seem to infringe any copyright rules?

1) If you start with your own variable names, it would auto-complete with those, maybe using something it learned online. Would that count as plagiarism in your sense? How would that differ from copy-pasting from Stack Overflow and changing the variable names? (I'm not an expert in SO copyright terms, but you should probably credit SO if doing so, and there might be some rules about distributing it commercially.)

2) Imagine you are using line-by-line auto-complete, and sometimes you rearrange the order of the lines, add your own code, or even modify the output a bit. At what point does it become your own code?

3) In cases 1) and 2) above, even if some of the outputs were verbatim (which apparently happens a tiny fraction of the time) and had exactly the same (probably conventional) variable names, would "some line of my code has exactly the same ordinary variable names as code on the internet" be enough for going to court?

4) Assuming that developers are, or will be, more productive using such tools, don't you think they would still use Copilot-like software to a) get inspiration and b) copy-paste code that they would later modify to bypass IP infringement, if they are smart enough about it, even though their company "forbids" them from using it?