Java modifiers
Open, Public

Description

Users would like to be able to know what modifiers (public, private, volatile, etc.) have been applied to each field/method/class.

schroederc created this task. Via Web, May 3 2016, 9:02 AM
schroederc added subscribers: schroederc, zarko.
schroederc added a project: Restricted Project.
misha added a subscriber: misha. Via Web, May 3 2016, 1:52 PM
schroederc changed the visibility of this Maniphest Task from "All Users" to "Public (No Login Required)". Via Web, May 16 2016, 3:27 PM
misha added a comment. Via Web, Dec 21 2016, 7:17 PM

I'm trying to implement this now, but ran into a hurdle.

I decided that it makes more sense to put access modifiers on childof edges instead of nodes. My reasoning is that a method being public or private (or static, etc.) doesn't really mean anything without the context of the class. The same reasoning applies to classes in packages (or in other classes).

Indeed, I can imagine a language with first-class functions where the same function appears as private in one class but public in another, although admittedly I have no idea whether such a language exists.

Does that make sense? Do you guys think access modifiers belong on edges instead of nodes?

When I started trying to implement this change, I ran into the following problem:

Classes in default packages do not have incoming childof edges, so there's nowhere to put the access modifier for the class.

What do you think? If access modifiers belong on the childof edges should we add a childof edge for classes in default packages?

Thanks!

fromberger added a subscriber: fromberger. Via Web, Jan 4 2017, 3:31 PM

As a rule we don't want to attach metadata to edges, as in our experience it greatly complicates the use of the graph.

Given that no method (in the Java sense) can be properly understood without reference to its class, could modifiers not be properties of the function node representing the method? Is there any information that would be lost by doing that, given that the modifiers are a property of the methods rather than their relationship?

misha added a comment. Via Web, Jan 6 2017, 4:30 PM

Cool. I took a stab at implementing this in D1232. Could someone please take a look? As I wrote in the description, I haven't updated the majority of the Java tests yet, because I'm waiting to see what your team thinks of the change in general.

fromberger added a comment. Via Web, Jan 13 2017, 8:22 PM

I don't think this issue is ready for a code patch yet. The proposal, I think, is to add modifiers to some nodes based on the Java-specific modifiers. It's easy enough to add text to the graph, but if we're going to make the indexers emit them, we need to have some language for them in the schema. I think there are a couple interesting questions we should answer first:

  1. Are these modifiers general enough that they should be supported for all languages, or are they specific to Java?
  2. How should properties like this be represented? D1232 advances the view that name="public" value="" is the right model; another option would be name="visibility" value="public", for example. There may be others.
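
For concreteness, the two shapes in (2) could look roughly like this when derived from Java reflection modifiers. This is only a sketch: the class and method names, and the "package" label for default visibility, are made up for illustration and are not taken from D1232 or the Kythe schema.

```java
import java.lang.reflect.Modifier;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch: neither layout below is an established Kythe schema.
class ModifierFactModels {
    // Model A: one fact per modifier keyword, with an empty value
    // (name="public" value="").
    static Map<String, String> factPerModifier(int mods) {
        Map<String, String> facts = new LinkedHashMap<>();
        String text = Modifier.toString(mods); // e.g. "public static"
        if (!text.isEmpty()) {
            for (String keyword : text.split(" ")) {
                facts.put(keyword, "");
            }
        }
        return facts;
    }

    // Model B: one fact per category, with the modifier as the value
    // (name="visibility" value="public").
    static Map<String, String> factPerCategory(int mods) {
        String visibility;
        if (Modifier.isPublic(mods)) {
            visibility = "public";
        } else if (Modifier.isProtected(mods)) {
            visibility = "protected";
        } else if (Modifier.isPrivate(mods)) {
            visibility = "private";
        } else {
            visibility = "package"; // assumed label for Java's default access
        }
        Map<String, String> facts = new LinkedHashMap<>();
        facts.put("visibility", visibility);
        return facts;
    }
}
```

Note that Model B always yields a visibility fact, even when no keyword is written, whereas Model A emits nothing for default access; that difference is part of what a consumer would have to know.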

Regarding (1): There are actually several questions here. Consider the static modifier: That label is used by several languages, but the meaning isn't the same. In C and C++, for example, "static" means several different things depending on where it's used (in a local variable declaration, it means the variable has global extent; in a class member it means the member is not bound to an instance; in a free function it means the function has static linkage; etc.).

Visibility and access modifiers like "public", "private", and "protected" are likewise a little different across languages. Similarly, there is a terminological issue with keywords like "final" and "const", which have different but overlapping meanings. Should languages that use conventions other than keywords (e.g., Python, Go) also use the same modifiers?

If tags are separated by language, the labels should probably reflect that, e.g., "<language-name>/mod/<modifier>". If they're universal, we should choose names that work across languages.

Regarding (2): Having one fact per modifier is easy for the indexer, but arguably trickier for tools that consume the data. How does the tool know which facts to ask for, and which ones should be present?

Offhand, I'm somewhat inclined to prefer a model like "visibility=public" to "public=true". Modifiers probably ought to be their own category, too, e.g., "/kythe/mod/visibility", so that tools can glob over them without accidentally picking up other stuff. But a reasonable case could be made either way.
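
As a rough illustration of the "glob over them" idea, a tool could select modifier facts by prefix alone. The sketch below assumes a node's facts are available as a plain string map and that modifier facts live under "/kythe/mod/"; both are assumptions for the sake of the example, and the class name is hypothetical.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch: assumes facts arrive as a name -> value map and that
// modifier facts share the proposed "/kythe/mod/" prefix.
class ModifierFactFilter {
    static final String MOD_PREFIX = "/kythe/mod/";

    // Returns only the facts whose names start with the modifier prefix,
    // so unrelated facts on the same node are never picked up.
    static Map<String, String> modifierFacts(Map<String, String> nodeFacts) {
        Map<String, String> out = new TreeMap<>();
        for (Map.Entry<String, String> fact : nodeFacts.entrySet()) {
            if (fact.getKey().startsWith(MOD_PREFIX)) {
                out.put(fact.getKey(), fact.getValue());
            }
        }
        return out;
    }
}
```

With a dedicated category, the consumer does not need a hard-coded list of modifier names to find them.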

Are there some specific use cases you can talk about? That is, do you have any examples of what a tool will do with these data? That might help focus the discussion a bit.

misha added a comment. Via Web, Jan 19 2017, 5:39 PM

Sure, there are a couple things we'd like to use this data for.

We're trying to index APIs so that we can determine API compatibility simply by looking in Kythe.

In the short term, we'd like to generate a TAGS file based on the contents of Kythe's index.

fromberger added a comment. Via Web, Jan 22 2017, 3:06 PM

> We're trying to index APIs so that we can determine API compatibility simply by looking in Kythe.

Is there a design somewhere for how "compatibility" would be determined? Is it meant to be a function-signature comparison of some kind? I ask because it's not at all clear to me how the modeling of visibility modifiers plays into that. Perhaps more importantly, this touches on my question (1) above, about whether this is specific to Java.

> [W]e'd like to generate a TAGS file based on the contents of Kythe's index.

Are modifiers important for that case? Assuming you mean CTAGS, I don't think the data format exposes visibility.

misha added a comment. Via Web, Jan 23 2017, 2:17 PM

>> We're trying to index APIs so that we can determine API compatibility simply by looking in Kythe.

> Is there a design somewhere for how "compatibility" would be determined? Is it meant to be a function-signature comparison of some kind? I ask because it's not at all clear to me how the modeling of visibility modifiers plays into that. Perhaps more importantly, this touches on my question (1) above, about whether this is specific to Java.

Here's an example to illustrate how we'd use the visibility information to determine API compatibility:

Let's say I'm using the io.widgetcorp.WidgetMaker API.
In version 2.0 Widget Corp adds a protected magicWidget() method to the WidgetMaker class.
In version 3.0 they change the visibility of WidgetMaker.magicWidget() to public.

If I have code that calls magicWidget() from a WidgetMaker subclass, then that code is compatible with versions 2.0 and 3.0.

If I have code which calls magicWidget() from a class that is not a subclass of WidgetMaker, then my code is only compatible with version 3.0.
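
The check I have in mind could be sketched like this. It's a simplification rather than a design: the class and method names are made up, the per-version visibility is assumed to have already been read out of the index, and it ignores Java's rule that protected members are also accessible from the same package.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of the WidgetMaker.magicWidget() example above.
class ApiCompatibility {
    enum Visibility { PRIVATE, PROTECTED, PUBLIC }

    // Simplified access rule: same-package access to protected members is ignored.
    static boolean accessible(Visibility v, boolean callerIsSubclass) {
        switch (v) {
            case PUBLIC:
                return true;
            case PROTECTED:
                return callerIsSubclass;
            default:
                return false;
        }
    }

    // Returns the versions in which the caller can use the method.
    // Versions where the method does not exist are simply absent from the map.
    static List<String> compatibleVersions(
            Map<String, Visibility> visibilityByVersion, boolean callerIsSubclass) {
        List<String> versions = new ArrayList<>();
        for (Map.Entry<String, Visibility> e : visibilityByVersion.entrySet()) {
            if (accessible(e.getValue(), callerIsSubclass)) {
                versions.add(e.getKey());
            }
        }
        return versions;
    }

    // Visibility of magicWidget() per WidgetMaker version, as in the example.
    static Map<String, Visibility> magicWidgetHistory() {
        Map<String, Visibility> history = new LinkedHashMap<>();
        history.put("2.0", Visibility.PROTECTED);
        history.put("3.0", Visibility.PUBLIC);
        return history;
    }
}
```

Calling compatibleVersions(magicWidgetHistory(), true) gives versions 2.0 and 3.0 for a subclass caller, and passing false gives only 3.0, matching the two cases above.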

This is not Java-specific: similar considerations apply to all languages with access modifiers (e.g., C++, Scala, etc.).

>> [W]e'd like to generate a TAGS file based on the contents of Kythe's index.

> Are modifiers important for that case? Assuming you mean CTAGS, I don't think the data format exposes visibility.

I've been using etags, which does expose visibility.
