Compiling Java Generics with Wildcards to C++ Templates - java

I am trying to build a Java to C++ trans-compiler (i.e. Java code goes in, semantically "equivalent" (more or less) C++ code comes out).
Not considering garbage collection, the languages are quite similar, so the overall process works quite well already. One issue, however, is generics, which do not exist in C++. Of course, the easiest way would be to perform erasure, as the Java compiler does. However, the resulting C++ code should be pleasant to work with, so I would rather not lose generic type information, i.e., the C++ code should still work with List<X> instead of List. Otherwise, the C++ code would need explicit casts everywhere such generics are used. This is error-prone and inconvenient.
So, I am trying to find a way to somehow get a better representation for generics. Of course, templates seem to be a good candidate. Although they are something completely different (metaprogramming vs. compile-time only type enhancement), they could still be useful. As long as no wildcards are used, just compiling a generic class to a template works reasonably well. However, as soon as wildcards come into play, things get really messy.
For example, consider the following Java constructor of a list:
class List<T> {
    List(Collection<? extends T> c) {
        this.addAll(c);
    }
}
// Usage
Collection<String> c = ...;
List<Object> l = new List<Object>(c);
How should this be compiled? I had the idea of using a chainsaw reinterpret cast between templates. The example above could then be compiled like this:
template<class T>
class List {
public:
    List(Collection<T*> c) {
        this->addAll(c);
    }
};
// Usage
Collection<String*> c = ...;
List<Object*>* l = new List<Object*>(reinterpret_cast<Collection<Object*>&>(c));
However, the question is whether this reinterpret cast produces the expected behaviour. Of course, it is dirty. But will it work? Usually, List<Object*> and List<String*> should have the same memory layout, as their template parameter is only a pointer. But is this guaranteed?
Another solution I thought of would be to replace methods that use wildcards with template methods that instantiate each wildcard parameter, i.e., compile the constructor to
template<class T>
class List {
public:
    template<class S>
    List(Collection<S*> c) {
        this->addAll(c);
    }
};
Of course, all other methods involving wildcards, such as addAll, would then also need template parameters. Another problem with this approach is handling wildcards in class fields, for example. I cannot use templates there.
A third approach would be a hybrid one: A generic class is compiled to a template class (call it T<X>) and an erased class (call it E). The template class T<X> inherits from the erased class E so it is always possible to drop genericity by upcasting to E. Then, all methods containing wildcards would be compiled using the erased type while others could retain the full template type.
What do you think about these methods? Where do you see their advantages and disadvantages?
Do you have any other thoughts on how wildcards could be implemented as cleanly as possible while keeping as much generic information in the code as possible?

Not considering garbage collection, the languages are quite similar, so the overall process works quite well already.
No. While the two languages actually look rather similar, they are significantly different as to "how things are done". Such 1:1 trans-compilations as you are attempting will result in terrible, underperforming, and most likely faulty C++ code, especially if you are looking not at a stand-alone application, but at something that might interface with "normal", manually-written C++.
C++ requires a completely different programming style from Java. This begins with not having all types derive from Object, touches on avoiding new unless absolutely necessary (and then restricting it to constructors as much as possible, with the corresponding delete in the destructor - or better yet, follow Potatoswatter's advice below), and doesn't end at "patterns" like making your containers STL-compliant and passing begin- and end-iterators to another container's constructor instead of the whole container. I also didn't see const-correctness or pass-by-reference semantics in your code.
Note how many of the early Java "benchmarks" claimed that Java was faster than C++, because Java evangelists took Java code and translated it to C++ 1:1, just like you are planning to do. There is nothing to be won by such transcompilation.

An approach you haven't discussed is to handle generic wildcards with a wrapper class template. So, when you see Collection<? extends T>, you replace it with an instantiation of your template that exposes a read-only[*] interface like Collection<T> but wraps an instance of Collection<?>. Then you do your type erasure in this wrapper (and others like it), which means the resulting C++ is reasonably nice to handle.
Your chainsaw reinterpret_cast is not guaranteed to work. For instance if there's multiple inheritance in String, then it's not even possible in general to type-pun a String* as an Object*, because the conversion from String* to Object* might involve applying an offset to the address (more than that, with virtual base classes)[**]. I expect you'll use multiple inheritance in your C++-from-Java code, for interfaces. OK, so they'll have no data members, but they will have virtual functions, and C++ makes no special allowance for what you want. I think with standard-layout classes you could probably reinterpret the pointers themselves, but (a) that's too strong a condition for you, and (b) it still doesn't mean you can reinterpret the collection.
[*] Or whatever. I forget the details of how the wildcards work in Java, but whatever's supposed to happen when you try to add a T to a List<? extends T>, and the T turns out not to be an instance of ?, do that :-) The tricky part is auto-generating the wrapper for any given generic class or interface.
[**] And because strict aliasing forbids it.
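For reference, the Java behaviour that such a wrapper would have to reproduce can be sketched in Java itself. This is only an illustration of the wildcard rules, not generated C++ and not the asker's code; the names WildcardDemo and copyOf are made up for the example. Reading through a "? extends" view is allowed, while adding a non-null element through it is rejected by the compiler.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.List;

class WildcardDemo {
    // Hypothetical helper: copies any collection whose elements are Objects.
    static List<Object> copyOf(Collection<? extends Object> source) {
        List<Object> target = new ArrayList<>();
        for (Object element : source) {
            target.add(element); // reading through the wildcard view is fine
        }
        // source.add(new Object()); // would not compile: the type behind '?' is unknown
        return target;
    }

    public static void main(String[] args) {
        Collection<String> strings = Arrays.asList("a", "b");
        List<Object> objects = copyOf(strings); // Collection<String> is accepted
        System.out.println(objects);
    }
}
```

A C++ wrapper that exposes only the reading half of the interface would match this: the write operations simply would not exist on the wrapper type.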

If the goal is to represent Java semantics in C++, then do so in the most direct way. Do not use reinterpret_cast as its purpose is to defeat the native semantics of C++. (And doing so between high-level types almost always results in a program that is allowed to crash.)
You should be using reference counting, or a similar mechanism such as a custom garbage collector (although that sounds unlikely under the circumstances). So these objects will all go to the heap anyway.
Put the generic List object on the heap, and use a separate class to access that as a List<String> or whatever. This way, the persistent object has the generic type that can handle any ill-formed means of accessing it that Java can express. The accessor class contains just a pointer, which you already have for reference counting (i.e. it subclasses the "native" reference, not an Object for the heap), and exposes the appropriately downcasted interface. You might even be able to generate the template for the accessor using the generics source code. If you really want to try.

Related

Can a language ever have compile-time checking but the characteristics of dynamic typing?

Upon reading the following:
A lot of people define static typing and dynamic typing with respect
to the point at which the variable types are checked. Using this
analogy, static typed languages are those in which type checking is
done at compile-time, whereas dynamic typed languages are those in
which type checking is done at run-time.
This analogy leads to the analogy we used above to define static and
dynamic typing. I believe it is simpler to understand static and
dynamic typing in terms of the need for the explicit declaration of
variables, rather than as compile-time and run-time type checking.
Source
I was thinking that the two ways we define static and dynamic typing, compile-time checking and explicit type declaration, are a bit like apples and oranges. A characteristic of all statically typed languages (to my knowledge) is that reference variables have a defined type. Can there be a language that has the benefits of compile-time checking (like Java) but also the ability to have variables unbounded to a specific type (like Python)?
Note: Not exactly type inference in a language like Java, because the variables are still assigned a type, just implicitly. This theoretical language wouldn't have reference types, so there would be no casting. I'm trying to avoid the use of "static typing" vs "dynamic typing" because of the confusion.
There could be, but should there be?
Imagine in hypothetical-pseudo-C++:
class Object
{
public:
virtual Object invoke(const char *name, std::list<Object> args);
virtual Object get_attr(const char *name);
virtual const Object &set_attr(const char *name, const Object &src);
};
And that you have a language that arranges:
to make Object class the root base class of all classes
syntactic sugar to turn blah.frabjugate() into blah.invoke("frabjugate") and
blah.x = 10 into blah.set_attr("x", 10)
Add to this something combining attributes of boost::variant and boost::any and you have a pretty good start. All the dynamicism (both good and runtime bugs bad) of Python with the eloquence and rigidity (yay!) of C++ or Java. With added run-time bloat and efficiency of hash-table lookups vs. call/jmp machine instructions.
In languages like Python, when you call blah.do_it(), the runtime potentially has to do multiple hash-table lookups of the string "do_it" to find out whether your instance blah or its class has a callable thing called "do_it", every time it is called. This is the most extreme late binding that could be imagined:
flarg.do_it() # replaces flarg.do_it()
flarg.do_it() # calls a different flarg.do_it()
You could have your hypothetical language give some control over when the binding occurs. C++-like standard methods are crudely static bound to the apparent reference type, not the real instance type. C++ virtual methods are late-bound to the object instance type. Python-like attributes and methods are extremely late bound to the current version of the object instance.
I think you could definitely program in a strong static typed language in a dynamic style, just as you could build an interpreter in a language like C++ or Java. Some syntax hooks could make it look a little more seamless. But maybe you could do the same in reverse: maybe a Python decorator that automatically checks argument types, or a MetaClass that does it at compile time? [no, I don't think this is possible...]
I think you should view it as a union of features. but you'd get both the best and the worst of both worlds...
Can there be a language that has the benefits of compile-time checking (like Java) but also the ability to have variables unbounded to a specific type (like Python)?
Actually, most languages support both, so yes. The difference is which form is preferred, easier, and generally used. Java prefers static types but also supports dynamic casts and reflection.
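As a rough sketch of what the reflection route looks like in Java: the variable below has the static type Object, and the method is looked up by name at run time. The class LateBindingDemo and the helper invokeByName are hypothetical names for this example; only the java.lang.reflect calls are real API.

```java
import java.lang.reflect.Method;

class LateBindingDemo {
    // Looks up a public no-argument method by name at run time and calls it.
    static Object invokeByName(Object target, String methodName) {
        try {
            Method method = target.getClass().getMethod(methodName);
            return method.invoke(target);
        } catch (ReflectiveOperationException e) {
            // A misspelled name is only detected here, at run time.
            throw new IllegalStateException("cannot invoke " + methodName, e);
        }
    }

    public static void main(String[] args) {
        Object value = "hello"; // the compiler only knows this is an Object
        System.out.println(invokeByName(value, "length"));
        try {
            invokeByName(value, "lenght"); // the typo compiles without complaint
        } catch (IllegalStateException e) {
            System.out.println("failed at run time: " + e.getMessage());
        }
    }
}
```

The compiler checks nothing about the method name here, which is the trade-off being discussed: the flexibility is available, but the checking moves to run time.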
This theoretical language wouldn't have reference types, so there would be no casting.
You have to consider that languages also need to perform reasonably well, so you have to consider how they will be implemented. You could have a supertype, but this makes optimisation very hard, and your code will most likely either run slowly or use much more resources.
The more popular languages tend to make pragmatic implementation choices. They are not purely one type or another and are willing to borrow styles even if they don't handle them as cleanly as a "pure" language.
what exactly do they allow the compiler or programmer to do that dynamic types can't?
It is generally accepted that the quicker you find a bug, the cheaper it is to fix. When you first start programming, the cost of maintenance isn't high in your mind, but once you have much more experience you will realise that a successful project costs far more to maintain than it did to develop and fixing long standing bugs can be really costly.
Static languages have two advantages:
You pick up bugs sooner rather than later. The sooner the better. With dynamic languages you might never discover a bug if the code is never run.
Maintenance is cheaper. Static languages make clearer the assumptions made when the code was first written and are more likely to detect issues if you don't have enough test coverage (btw, you never have enough test coverage).
No, you cannot. The difference here boils down to early binding versus late binding. Early binding means matching everything up on the binary level upfront, fixing it in code. The result is rigid, type-safe and fast code. Late binding means there is some kind of runtime interpretation involved. This results in flexibility (potentially unsafe) at the cost of performance.
The two approaches are different on a technical level (compilation versus interpretation) and the programmer would have to choose which is desired when, which would defeat the benefit of having both in the first place.
In languages that use a (common) language runtime however you do get some of what you are asking for through reflection. But it is organized differently and still type-safe. It is not the implicit kind of binding you refer to but requires a bit of work and awareness from the programmer.
As far as what is possible with static types that is impossible with dynamic types: nothing. They are both Turing complete.
The value of static types is finding bugs early. In Python, something as simple as a misspelled name isn't caught until you run the program, and even then only if the line of code with the misspelling is run.
class NuclearReactor():
    def turn_power_off(self):
        ...
    def shut_down_cleanly(self):
        # Misspelled on purpose: this only fails when the line is actually run.
        self.turn_power_of()

Minimizing interfaces in Golang

In Go, interfaces are extremely important for decoupling and composing code, and thus an advanced Go program might easily define thousands of interfaces.
How do we evolve these interfaces over time, to ensure that they remain minimal?
Are there commonly used go tools which check for unused functions ?
Are there best practices for annotating Go functions with something similar to Java's @Override, which ensures that a declared function properly implements an expected contract?
Typically in the java language, it is easy to keep code tightly bound to an interface specification because the advanced tooling allows us to find and remove functions which aren't referenced at all (usually this is highlighted automatically for you in any common IDE).
Are there commonly used go tools which check for unused functions ?
Sort of, but it is really hard to be sure for exported interfaces. oracle can be used to find references to types or methods, but only if all of the code that references you is available on your GOPATH.
can you ensure a type implements a contract?
If you attempt to use a type as an interface, the compiler will complain if it does not have all of the methods. I generally do this by exporting interfaces but not implementations, and making a constructor:
type MyInterface interface {
    Foo()
}

type impl struct{}

func (i *impl) Foo() {}

func NewImpl() MyInterface {
    return &impl{}
}
This will not compile if impl does not implement all of the required functions.
In go, it is not needed to declare that you implement an interface. This allows you to implement an interface without even referencing the package it is defined in. This is pretty much exactly the opposite of "tightly binding to an interface specification", but it does allow for some interesting usage patterns.
What you're asking for isn't really a part of Go. There are no best practices for annotating that a function satisfies an interface. I would personally say the only clear best practice is to document which interfaces your types implement so that people can know. If you want to test explicitly (at compile time) whether a type implements an interface, you can do so using assignment; check out my answer here on the topic: How to check if an object has a particular method?
If you're just looking to take inventory of your code base to do some clean up I would recommend using that assignment method for all your types to generate compile time errors regarding what they don't implement, scale down the declarations until it compiles. In doing so you should become aware of the disparity between what might be implemented and what actually is.
Go is also lacking in IDE options. As a result some of those friendly features like "find all references" aren't there. You can use text searching tricks to get around this, like searching func TheName to get only the declaration and .TheName( to get all invocations. I'm sure you'll get used to it pretty quickly if you continue to use this tooling.

java's typing system: prefer interface types to class types as method parameters/return values

I am just making an effort to understand the power of interfaces and how to use them to best advantage.
So far, I understood that interfaces:
enable us to have another layer of abstraction, separating the what (defined by the interface) from the how (any valid implementation).
Given just one single implementation, I would just build a house (in one particular way) and say "here, it's done", instead of coming round with a building plan (the interface) and asking you, the other developers, to build it as I expect.
So far, so good.
What still puzzles me is why to favor interface types over class types when it comes to method parameters and return values. Why is that so? What are the benefits (drawbacks of the class approach)?
What interests me the most is how this actually translates into code.
Say we have a sort of pseudo mathInterface
public interface pseudoMathInterface {
    double getValue();
    double getSquareRoot();
    List<Double> getFirstHundredPrimes();
}
//...
public class mathImp implements pseudoMathInterface { }
//.. actual implementation
So in the case of the getPrimes() method I would bind it to List, meaning any concrete implementation of the List interface rather than a concrete implementation such as ArrayList!?
And in terms of the method parameter, would I once again broaden my options while ensuring that I can do with the type whatever I need, given that it is part of the contract of the interface which the type finally implements!?
Say you are the creator of a Maven dependency, a JAR with a well-known, well-specified API.
If your method requests an ArrayList<Thing>, treating it as a collection of Things, but all I have got is a HashSet<Thing>, your method will twist my arm into copying everything into an ArrayList for no benefit;
if your method declares to return an ArrayList<Thing>, which (semantically) contains just a collection of Things and the index of an element within it carries no meaning, then you are forever binding yourself to returning an actual ArrayList, even though e.g. the future course of the project makes it obvious that a custom collection implementation, specifically tailored to the optimization of the typical use case of this method, is desperately needed to improve a key performance bottleneck.
You are forced to make an API breaking change, again for no benefit to your client, but just to fix an internal issue. In the meantime you've got people writing code which assumes an ArrayList, such as iterating through it by index (there is an extremely slight performance gain to do so, but there are early optimizers out there to whom that's plenty).
I propose you judiciously generalize from the above two statements into general principles which capture the "why" of your question.
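To make the two statements concrete, here is a minimal sketch. The class ThingCatalog and the method sortedNames are invented for the illustration, and String stands in for Thing. The parameter is declared as Collection, so a HashSet can be passed without copying, and the return type is declared as List, so the ArrayList used internally could be swapped later without breaking callers.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;

class ThingCatalog {
    // Accepts any Collection; promises only "some List" in return.
    static List<String> sortedNames(Collection<String> things) {
        List<String> names = new ArrayList<>(things); // implementation detail, free to change
        Collections.sort(names);
        return names;
    }

    public static void main(String[] args) {
        Collection<String> asSet = new HashSet<>(Arrays.asList("b", "a"));
        Collection<String> asList = Arrays.asList("b", "a");
        // Both callers work unchanged, whatever concrete collection they hold.
        System.out.println(sortedNames(asSet));
        System.out.println(sortedNames(asList));
    }
}
```

Had the signature been ArrayList<String> sortedNames(ArrayList<String> things), the first caller would need an extra copy and the method could never return anything but an ArrayList.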
An important reason to prefer interfaces for formal argument types is that it does not bind you to a particular class hierarchy. Java supports only single inheritance of implementation (class inheritance), but it supports unlimited inheritance of interface (implements).
Return types are a different question. A good rule of thumb is to prefer the most general possible argument types, and the most specific possible return types. The "most general possible" is pretty easy, and it clearly lines up with preferring interface types for formal arguments. The "most specific possible" return types is trickier, however, because it depends on just what you mean by "possible".
One reason for using interface types as your methods' declared return types is to allow you to return instances of non-public classes. Another is to preserve the flexibility to change what specific type you return without breaking dependent code. Yet another is to allow different implementations to return different types. That's just off the top of my head.
So in the case of the getPrimes() method I would bind it to List, meaning any concrete implementation of the List interface rather than a concrete implementation such as ArrayList!?
Yes, this allows the method to later change which List type it returns without breaking client code that uses the method.
Besides having the ability to change what object is really passed to/returned from a method without breaking code, sometimes it may be better to use an interface type as a parameter/return type to lower the visibility of fields and methods available. This would reduce overall complexity of the code that then uses that interface type object.

Using Generics in a non collection

Saw a couple of similar questions today- got me thinking:
What are the rules for when to use generics?
When a collection is involved?
When there are getter methods which return collection elements?
Whether the object changes type during its lifetime?
Whether the relationship is composition/aggregation to the class?
There doesn't seem to be a consensus on the questions you should ask yourself in order to determine whether you should use generics. Is it purely an opinionated decision?
Is it easier to ask when you shouldn't use generics??
Let me start with some general points about generics and type information before I get back to the first point on your bullet list.
Generics prevent unnecessary type casts.
Do you remember Java before generics were introduced? Type casts were used everywhere.
This is what type casts essentially are: You are telling the compiler about an object's type, because the compiler doesn't know or cannot infer it.
The problem with type casts is that you sometimes make mistakes. You can suggest to the compiler that an instance of class Fiddle is a Frobble ((Frobble)fiddle), and the compiler will happily believe you and compile your source code. But if it turns out that you were wrong, you'll much later get a nice run-time error.
Generics are a different, and usually safer, way of letting the compiler retain type information. Basically, the compiler is less likely to make typing mistakes than a human programmer... the fewer type casts required, the fewer potential error sources! Once you've established that a list can only contain Fiddle objects (List<Fiddle>), the compiler will keep this information and spare you from having to type-cast each item in that list to some type. (You still could cast a list item to Frobble, but why should you, now that the compiler lets you know that the item is a Fiddle!?)
I have found that generics greatly reduce the need for type casting, so the presence of lots of type casts — especially when you always cast to the same type — might be an indicator that generics should be used instead.
Letting the compiler keep as much type information as possible is a good thing because typing errors can be discovered earlier (at compile-time instead of at run-time).
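A small sketch of the difference, using made-up names (CastDemo, rawListFailsAtRunTime, firstOf) and String/Integer in place of Fiddle and Frobble. The raw list accepts anything and the wrong cast only fails when it is executed; with List<String> the same mistake would be rejected by the compiler.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class CastDemo {
    // Pre-generics style: the list holds Objects, so every read needs a cast.
    @SuppressWarnings({"rawtypes", "unchecked"})
    static boolean rawListFailsAtRunTime() {
        List raw = new ArrayList();
        raw.add(Integer.valueOf(42)); // nothing stops the wrong element going in
        try {
            String s = (String) raw.get(0); // compiles, but the claim is wrong
            return false;
        } catch (ClassCastException e) {
            return true; // the mistake surfaces only here, at run time
        }
    }

    // Generic style: the element type is known, so no cast is needed.
    static String firstOf(List<String> names) {
        // names.add(42); // would not compile
        return names.get(0);
    }

    public static void main(String[] args) {
        System.out.println(rawListFailsAtRunTime());
        System.out.println(firstOf(Arrays.asList("fiddle")));
    }
}
```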
Generics as a replacement for the "generic" java.lang.Object type:
Before generics, if you wanted to write a method that worked on any type, you employed the java.lang.Object supertype, because every class derives from it.
Generics allow you to also write methods that work for any type, but without forcing you or the compiler to throw away known type information — that's exactly what happens when you cast an object to the Object type. So, frequent use of the Object type might be another indicator that generics might be appropriate.
When a collection is involved?
Why do generics seem an especially good fit for collections? Because, according to the above reasoning, collections are rarely allowed to contain just any kind of object. If that were so, then the Object type would be appropriate because it doesn't put any restrictions whatsoever on the collection. Usually however, you expect all items in a collection to be (at least) a Frobble (or some other type), and it helps if you let the compiler know. Generics are the way how to do just that.
Whether the relationship is composition/aggregation to the class?
You've linked to another question that asks whether a class Person having a car property should be made generic, as class Person<T extends ICar>.
In that case, it depends whether your program needs to distinguish between Honda people and Opel people. By making such a Person class generic, you essentially introduce the possibility of different kinds of people. If this actually solves a problem in your code, then go for it. If, however, it only introduces hurdles and difficulties, then resist the urge and stay with your non-generic Person class.
Side note: Keep in mind that you don't have to make a whole class generic; you can make only a few specific methods generic. At least in the .NET ecosystem, it is recommended to keep generics as "local" as possible, i.e. don't turn a class into a generic one when it's sufficient to make only a method generic.
I find myself using generics when the following three criteria are met:
I note that I am repeating code, and start thinking of how to refactor it into a new method/class.
The class/method I am rewriting doesn't really care about what the concrete type of one of the arguments is, only that it follows a certain contract (eg <T extends Bar>).
The return type of the method/one of the methods is related to said parameter or
two or more parameters are related and need to have the same type, although I don't really care what that type is.
Usually when these criteria are met, there is a Collection of some kind involved, but not necessarily.
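A minimal example of a method that meets these criteria without any collection being involved. The names GenericMethodDemo and larger are invented for the sketch: the method only cares that its arguments follow the Comparable contract, both parameters must share one type, and the return type is tied to that same type.

```java
class GenericMethodDemo {
    // T links both parameters and the return type; only the method is generic,
    // not the enclosing class.
    static <T extends Comparable<T>> T larger(T first, T second) {
        return first.compareTo(second) >= 0 ? first : second;
    }

    public static void main(String[] args) {
        Integer n = larger(3, 7);           // T is inferred as Integer
        String s = larger("apple", "pear"); // T is inferred as String
        System.out.println(n + " " + s);
        // larger(3, "pear"); // would not compile: the two arguments must share one type
    }
}
```

Without generics this would need either one overload per type (repeated code) or Object parameters plus a cast at every call site.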
In my opinion the second statement (when not to use them) is correct.
When not to use generics: when strong typing is too restrictive (typically generics within generics). In some cases you want to ensure loose coupling among your components, and the point is "send me what you want, the API will somehow handle it"; then you will employ some kind of visitor rather than specifying a complete, concrete API using some generic type.
When you should: if you did not, you would have to cast the variable to some type (you might even have to guess or use instanceof)...
Just one sidenote: every structured type is some kind of collection...
Is it easier to ask when you shouldn't use generics??
To answer this question: one of the major problems with generics is their treatment of checked exceptions. Here is a write-up from Goetz about this.
Reason why you should consider generics, again there is a cache of information shared.

C++ and Java : Use of virtual base class

I have some doubts while comparing C++ and Java multiple inheritance.
Even Java uses multiple, multi-level inheritance through interfaces, but why doesn't it use anything like a virtual base class as in C++? Is it because the members of a Java interface are guaranteed one copy in memory (they are public static final), and the methods are only declared and not defined?
Apart from saving memory, is there any other use of virtual base classes in C++? Are there any caveats if I forget to use this feature in my multiple inheritance programs?
This one is a bit philosophical, but why didn't the C++ developers make every base class virtual by default? What was the need for providing this flexibility?
Examples will be appreciated. Thanks !!
1) Java interfaces don't have attributes. One reason for virtual base classes in C++ is to prevent duplicate attributes and all the difficulties associated with that.
2) There is at least a slight performance penalty for using virtual base classes in C++. Also, the constructors become so complicated that it is advised that virtual base classes only have no-argument constructors.
3) Exactly because of the C++ philosophy: one should not pay a penalty for something which one may not need.
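Point 1 can be illustrated with a diamond built from Java interfaces. This is a sketch with invented names (Named, Walker, Swimmer, Duck), and it assumes Java 8 or later for the default methods. The interface Named is reached along two paths, but since interfaces carry no instance fields there is no state that could be duplicated, and the conversion to Named is never ambiguous.

```java
interface Named {
    String name();
}

interface Walker extends Named {
    default String walk() { return name() + " walks"; }
}

interface Swimmer extends Named {
    default String swim() { return name() + " swims"; }
}

// Duck inherits Named twice, via Walker and via Swimmer, yet implements name() once.
class Duck implements Walker, Swimmer {
    @Override
    public String name() { return "duck"; }
}

class DiamondDemo {
    public static void main(String[] args) {
        Duck duck = new Duck();
        Named named = duck; // no ambiguity, unlike a non-virtual diamond in C++
        System.out.println(named.name());
        System.out.println(duck.walk());
        System.out.println(duck.swim());
    }
}
```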
Sorry - not a Java programmer, so short on details. Still, virtual bases are a refinement of multiple inheritance, which Java's designers always defended omitting on the basis that it's overly complicated and arguably error-prone.
virtual bases aren't just for saving memory - the data is shared by the different objects inheriting from them, so those derived types could use it to coordinate their behaviour in some way. They're not useful all that often, but as an example: object identifiers where you want one id per most-derived object, and not to count all the subobjects. Another example: ensuring that a multiply-derived type can unambiguously map / be converted to a pointer-to-base, keeping it easy to use in functions operating on the base type, or to store in containers of Base*.
As C++ is currently Standardised, a type deriving from two classes can typically expect them to operate independently and as objects of that type tend to do when created on the stack or heap. If everything was virtual, suddenly that independence becomes highly dependent on the types from which they happen to be derived - all sorts of interactions become the default, and derivation itself becomes less useful. So, your question is why not make the default virtual - well, because it's the less intuitive, more dangerous and error-prone of the two modes.
1. Java multiple inheritance in interfaces behaves most like virtual inheritance in C++.
More precisely, to implement a Java-like inheritance model in C++ you need to use C++ virtual base classes.
However, one of the disadvantages of C++ virtual inheritance (besides a small memory and performance penalty) is that it is impossible to static_cast<> from base to derived, so RTTI (dynamic_cast) needs to be used
(or one may provide "hand-made" virtual casting functions for child classes if the list of
such child classes is known in advance).
2. If you forget the "virtual" qualifier in the inheritance list, it usually leads to a compiler error,
since any cast from the derived class to the base class becomes ambiguous.
3. Philosophical questions are usually quite difficult to answer... C++ is a multiparadigm (and multiphilosophical) language and doesn't impose any philosophical decisions. You may use virtual inheritance whenever possible in your own projects, and (you are right) there is a good reason for it. But such a maxim may be unacceptable for others, so universal C++ tools (the standard library and other widely used libraries) should be (if possible) free of any particular philosophical conventions.
I'm working on an open source project which basically is translating a large C++ library to Java. The object model of the original creature in C++ can be pretty complicated sometimes. More than necessary, I'd say... which was more or less the motto of Java designers... well... this is another subject.
The point is that I've written an article which shows how you can circumvent type erasure in Java. The article explains well how it can be done and, in the end how your source code can eventually resemble C++ very closely.
http://www.jquantlib.org/index.php/Using_TypeTokens_to_retrieve_generic_parameters
An immediate implication of the study I've done is that it would be possible to implement virtual base classes in your application, I mean: not in Java, not in the language, but in your application, via some tricks, or a lot of tricks to be more precise.
In case you do have interest for such kind of black magic, the lines below may be useful for you somehow. Otherwise certainly not.
Ok. Let's go ahead.
There are several difficulties in Java:
1. Type erasure (solved in the article)
2. javac was not designed to understand what a virtual base class would be;
3. Even using tricks you will not be able to circumvent difficulty #2, because this difficulty appears at compilation time.
If you'd like to use virtual base classes, you can have it with Scala, which basically solved difficulty #2 by exactly creating another compiler, which fully understands some more sophisticated object models, I'd say.
If you'd like to explore my article and try to "circumvent" virtual base classes in pure Java (not Scala), you could do something like I explain below:
Suppose that you have something like this in C++:
template<class Base>
class Extended : public Base { ... };
It could be translated to something like this in Java:
public interface Virtual<T> { ... }
public class Extended<B> implements Virtual<B> { ... }
OK. What happens when you instantiate Extended like below?
Extended<Base> extended = new Extended<Base>() { /* required anonymous block here */ };
Well... basically you will be able to get around type erasure and obtain type information about Base inside your class Extended. See my article for a comprehensive explanation of the black magic.
OK. Once you have type of Base inside Extended, you can instantiate a concrete implementation of Virtual.
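The article itself is not reproduced here, so the following is only a sketch of the general technique it appears to rely on, often called a type token: an anonymous subclass records its type argument in class metadata, where reflection can read it back. The class TypeCapture and its method capturedType are names invented for this example, not code from the article.

```java
import java.lang.reflect.ParameterizedType;
import java.lang.reflect.Type;

abstract class TypeCapture<B> {
    private final Type captured;

    protected TypeCapture() {
        // For new TypeCapture<String>() { }, the generic superclass of the
        // anonymous class is TypeCapture<String>; that fact is not erased.
        Type superType = getClass().getGenericSuperclass();
        if (!(superType instanceof ParameterizedType)) {
            throw new IllegalStateException("create via an anonymous subclass with a type argument");
        }
        captured = ((ParameterizedType) superType).getActualTypeArguments()[0];
    }

    Type capturedType() {
        return captured;
    }
}

class TypeCaptureDemo {
    public static void main(String[] args) {
        // The trailing braces create the anonymous subclass that carries the type.
        TypeCapture<String> token = new TypeCapture<String>() { };
        System.out.println(token.capturedType());
    }
}
```

This explains why the anonymous block is required in the instantiation: without a subclass there is no class-level record of the type argument to read back.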
Notice that, at compile time, javac can verify types for you, like in the example below:
public interface Virtual<Base> {
    public List<Base> getList();
}
public class Extended<Base> implements Virtual<Base> {
    @Override
    public List<Base> getList() {
        // TODO Auto-generated method stub
        return null;
    }
}
Well... despite all the effort to implement it, in the end we are doing badly what an excellent compiler like scalac does much better than us; in particular, it does its job at compile time.
I hope it helps... if it hasn't confused you already!
