I'm learning PHP5 (the last time I looked, PHP was at version 4), and I'm glad to see that PHP5's OO model is more Java-like than PHP4's. There's still one issue that makes me quite uncomfortable because of my Java background: ARRAYS.
I'm reading "Professional PHP6" (Wrox), and it shows its own Collection implementation.
I've found other classes, like the one at http://aheimlich.dreamhosters.com/generic-collections/Collection.phps, based on SPL.
I've also found that there's some kind of Collection in SPL (ArrayObject)
However, I'm surprised because I don't really see people using Collections in PHP; they seem to prefer arrays.
So, isn't it a good idea to use Collections in PHP, just as people use ArrayList instead of basic arrays in Java? After all, PHP arrays aren't really like Java arrays.
Collections in Java make a lot of sense since it's a strongly typed language. It makes sense to have a collection of say "Cars" and another of "Motorbikes".
However, in PHP, due to the dynamically typed nature, it is quite common to sacrifice the formality of Collections. Arrays are sufficient to be used as generic containers of various object types (Cars, Motorbikes, etc.). Also, the added benefit comes from the fact that arrays can be mutated very easily (which sometimes can be a big disadvantage when proper error checking is absent).
I come from a Java background, and I've found that using a Collections design pattern in PHP does not buy much in the way of advantages (no multi-threading, no optimization of memory allocation, no iterators, etc.).
If you're looking for any of those advantages, it's probably better to construct a wrapper class around the array, implementing each feature (iterators, etc.) a la carte.
I am very much in favour of collection objects in PHP. They can be used to add type safety and to implement easy-to-use search, sort and manipulation functionality, and they represent the correct OO approach, rather than using arrays and the multitude of useful but procedural functions that operate on them in differing patterns all over the source.
We have various collections that we use for various purposes all neatly inherited promoting type safety, consistent coding standards and a high level of code reuse.
But ultimately, they are all arrays internally!
I suppose really it comes down to choice, but in my object oriented world I like to keep easily repeatable segments of code such as sort and search algorithms in base classes, and I find the object notation more self documenting.
PHP arrays are associative... They're far more powerful than Java's arrays, and include much of the functionality of List<> and Map<>.
What do you mean by "good idea"? They're different tools; using one language the way you used another usually results in frustration.
I, too, was somewhat dismayed to find no Collection type classes in PHP. Arrays have a couple of real disadvantages in my experience.
First, the number of functions available to manipulate them is somewhat limited. For example, I need to be able to arbitrarily insert and remove items to/from a Collection at a given index position. Doing that with the built-in language functions for arrays in PHP is painful at best.
Second, as a sort of offshoot of the first point, writing clean, readable code that manipulates arrays at any level of complexity beyond simple push/pop and iterator stuff is difficult at best. I often find that I have to use one array to index and keep track of another array in data-intensive apps I create.
I prefer working in a framework (my personal choice is NOLOH). There, I have a real Collection class called ArrayList that has functions such as Add, Insert, RemoveAt, RemoveRange and Toggle. I imagine other PHP frameworks address this issue as well.
A nice implementation of collections in PHP is provided by Varien Lib. This library is part of the Magento code base and is released under the OSL license.
I cannot find any standalone source code for the library, so the best way is to download Magento and then look in /lib/Varien/.
Yii has an implementation of a full Java-like collections stack:
http://www.yiiframework.com/doc/api/1.1/CList
I sometimes use this really simple implementation to give me a rough and ready collection.
Normally the main requirement of a collection is enforcing a group of one type of object; you just have to set up a basic class with a constructor to implement it.
class SomeObjectCollection {
    /**
     * @var SomeObject[]
     */
    private $collection = array();

    /**
     * @param SomeObject $object1
     * @param SomeObject $_ [optional]
     */
    function __construct(SomeObject $object1 = null, SomeObject $_ = null)
    {
        // Accept any number of arguments, but reject anything that is not a SomeObject.
        foreach (func_get_args() as $arg) {
            if (! $arg instanceof SomeObject) {
                throw new \RuntimeException('All arguments must be of type SomeObject');
            }
            $this->collection[] = $arg;
        }
    }

    /**
     * @return SomeObject[]
     */
    public function getAll()
    {
        return $this->collection;
    }
}
Both ArrayLists and Vectors make use of typical arrays internally. However, that leaves me thinking... why would I use ArrayLists when I can technically do the same thing using Arrays? Is convenience the only reason? Do performance-critical applications ever make use of an ArrayList?
Any tips would be appreciated.
I believe there are multiple reasons to prefer Lists over "implementing lists over arrays" or over "using arrays", but here are the two that I think are most important:
Lists have better support for generics than arrays (you can, and should, read about it in "Effective Java" by Bloch; see Item 25).
If you are asking about using ArrayList vs. implementing it yourself: I find it hard to believe that you'll do a better job than the guys who developed it in OpenJDK (Josh Bloch and Neal Gafter).
Yes, performance critical applications use ArrayList all the time. It's very unlikely that array access is the dominant factor in the vast majority of programs written in Java.
The ArrayList Collection interface is much richer than the functionality provided by built-in primitive arrays. This extra functionality will save you development time as well as debugging time by not having to write those algorithms yourself.
Additionally, many programmers are already familiar with the ArrayList Collection interface and thus by utilizing the existing standard libraries it will make your code easier to read and maintain for the long term.
One reason is that ArrayList sizes are dynamic; array sizes aren't.
Internally, ArrayList is implemented with an array, but ArrayList is a wrapper class that adds more capabilities. These capabilities are not available when you deal with an array directly.
For example,
Deleting an element: if you are using an array, you will have to implement the logic yourself. If you are using an ArrayList, it will do the deletion for you.
Adding an element:
If you are using an array, you will have to implement the logic. With an ArrayList, it is pretty easy.
You will find a lot of methods in the ArrayList class that are handy for day-to-day use.
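To make the difference concrete, here is a minimal sketch. The class and method names (ArrayVersusList, removeAt, append) are made up for illustration; only System.arraycopy, Arrays.copyOf and the ArrayList methods are standard library calls.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class ArrayVersusList {
    // Removing from a plain array: allocate a smaller array and copy around the gap.
    static int[] removeAt(int[] source, int index) {
        int[] result = new int[source.length - 1];
        System.arraycopy(source, 0, result, 0, index);
        System.arraycopy(source, index + 1, result, index, source.length - index - 1);
        return result;
    }

    // Appending to a plain array: allocate a larger array and copy everything over.
    static int[] append(int[] source, int value) {
        int[] result = Arrays.copyOf(source, source.length + 1);
        result[source.length] = value;
        return result;
    }

    public static void main(String[] args) {
        int[] numbers = {10, 20, 30};
        numbers = removeAt(numbers, 1);
        numbers = append(numbers, 40);
        System.out.println(Arrays.toString(numbers));

        // The same two operations with an ArrayList.
        List<Integer> list = new ArrayList<>(Arrays.asList(10, 20, 30));
        list.remove(1); // int argument, so this is remove(int index)
        list.add(40);
        System.out.println(list);
    }
}
```

Both halves end up holding 10, 30, 40; the array half just needs the two helper methods to get there.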
Hope this will help you.
I have some pieces of C++ code that store objects in CObArray. I want to re-code the same pieces in Java using ArrayList to store the same objects. Will there be any difference in the overall efficiency?
So is ArrayList the exact correspondent class for CObArray?
I didn't know what a CObArray was: CObArray is part of MS's C++ implementations.
The description of a CObArray sounds like it behaves in a similar fashion to an ArrayList, that is, in terms of implementation and performance. You should bear in mind that the interfaces will certainly differ. For example, Java's ArrayList does not have anything like GetUpperBound(). If you depend on something like this, you should ensure you can live without the corresponding methods.
In addition, the preferred way to work with ArrayLists in Java is with Generics (you specify the type that will exist in the collection at compile time, as opposed to performing casts at run time). This sounds like it may differ from how it works with CObArrays, according to AJG85. Before you begin your conversion to Java, make sure you are aware of differences like this and how they work.
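As a rough illustration of the Generics point on the Java side only (I can't speak for the CObArray side), compare a raw list with a typed one. GenericsDemo and its method names are invented for this sketch.

```java
import java.util.ArrayList;
import java.util.List;

class GenericsDemo {
    // Raw list: it accepts any Object, so reading an element needs a run-time cast.
    @SuppressWarnings({"rawtypes", "unchecked"})
    static String firstFromRaw() {
        List raw = new ArrayList();
        raw.add("engine");
        return (String) raw.get(0); // ClassCastException here if a non-String had been stored
    }

    // Typed list: the element type is checked by the compiler, no cast needed.
    static String firstFromTyped() {
        List<String> typed = new ArrayList<>();
        typed.add("engine");
        // typed.add(42); // would not compile
        return typed.get(0);
    }
}
```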
Well, it seems to me ArrayLists make it easier to expand the code later on both because they can grow and because they make using Generics easier. However, for multidimensional arrays, I find the readability of the code is better with standard arrays.
Anyway, are there some guidelines on when to use one or the other? For example, I'm about to return a table from a function (int[][]), but I was wondering if it wouldn't be better to return a List<List<Integer>> or a List<int[]>.
Unless you have a strong reason otherwise, I'd recommend using Lists over arrays.
There are some specific cases where you will want to use an array (e.g. when you are implementing your own data structures, or when you are addressing a very specific performance requirement that you have profiled and identified as a bottleneck) but for general purposes Lists are more convenient and will offer you more flexibility in how you use them.
Where you are able to, I'd also recommend programming to the abstraction (List) rather than the concrete type (ArrayList). Again, this offers you flexibility if you decide to change the implementation details in the future.
To address your readability point: if you have a complex structure (e.g. ArrayList of HashMaps of ArrayLists) then consider either encapsulating this complexity within a class and/or creating some very clearly named functions to manipulate the structure.
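For the table case from the question, the encapsulation idea might look something like this. It is only a sketch: the Table class and its method names are hypothetical, and a real version would probably validate row lengths.

```java
import java.util.ArrayList;
import java.util.List;

class Table {
    // The nested structure stays private; callers only see the named methods.
    private final List<List<Integer>> rows = new ArrayList<>();

    void addRow(List<Integer> row) {
        rows.add(new ArrayList<>(row)); // defensive copy so later changes to 'row' don't leak in
    }

    int cell(int row, int column) {
        return rows.get(row).get(column);
    }

    int rowCount() {
        return rows.size();
    }
}
```

Calling code then reads table.cell(1, 2) instead of table.get(1).get(2), and the storage can later change to int[][] without touching the callers.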
Choose a data structure implementation and interface based on primary usage:
Random Access: use List for variable type and ArrayList under the hood
Appending: use Collection for variable type and LinkedList under the hood
Loop and process: use Iterable and see the above for use under the hood based on producer code
Use the most abstract interface possible when handing around data. That said don't use Collection when you need random access. List has get(int) which is very useful when random access is needed.
Typed collections like List<String> make up for the syntactical convenience of arrays.
Don't use Arrays unless you have a qualified performance expert analyze and recommend them. Even then you should get a second opinion. Arrays are generally a premature optimization and should be avoided.
Generally speaking, you are far better off using an interface rather than a concrete type. The concrete type makes it hard to rework the internals of the function in question. For example, if you return int[][] you have to do all of the computation up front. If you return List<List<Integer>> you can lazily do the computation during iteration (or even concurrently in the background) if that is beneficial.
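One way to get that laziness is to return a List<List<Integer>> backed by AbstractList, so each cell is computed only when someone asks for it. The multiplication table below is just a stand-in for whatever the real function computes, and LazyRows is a made-up name.

```java
import java.util.AbstractList;
import java.util.List;

class LazyRows {
    // Returns an n-by-n multiplication table; nothing is computed until get() is called.
    static List<List<Integer>> multiplicationTable(final int n) {
        return new AbstractList<List<Integer>>() {
            @Override
            public List<Integer> get(final int row) {
                if (row < 0 || row >= n) {
                    throw new IndexOutOfBoundsException("row " + row);
                }
                return new AbstractList<Integer>() {
                    @Override
                    public Integer get(int column) {
                        if (column < 0 || column >= n) {
                            throw new IndexOutOfBoundsException("column " + column);
                        }
                        return (row + 1) * (column + 1); // computed on access
                    }

                    @Override
                    public int size() {
                        return n;
                    }
                };
            }

            @Override
            public int size() {
                return n;
            }
        };
    }
}
```

The caller iterates it like any other list. Because only get and size are overridden, the result is read-only, which a plain int[][] cannot offer.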
The List is more powerful:
You can resize the list after it has been created.
You can create a read-only view onto the data.
It can be easily combined with other collections, like Set or Map.
The array works on a lower level:
Its content can always be changed.
Its length can never be changed.
It uses less memory.
You can have arrays of primitive data types.
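The read-only view from the first list can be sketched with Collections.unmodifiableList. ReadOnlyViewDemo is an invented class; the point is only that the owner keeps a mutable list while handing out a view that rejects changes.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class ReadOnlyViewDemo {
    private final List<String> names = new ArrayList<>();

    void add(String name) {
        names.add(name);
    }

    // Callers can read through the view, but add/remove/set on it
    // throw UnsupportedOperationException.
    List<String> view() {
        return Collections.unmodifiableList(names);
    }
}
```

The view is not a copy: elements added later through add() show up in a view obtained earlier. An array has no equivalent, since anyone holding the array can assign to its slots.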
I wanted to point out that Lists can hold the wrappers for the primitive data types that would otherwise need to be stored in an array (i.e., a class Double that has only one field: a double). The newer versions of Java convert to and from these wrappers implicitly, at least most of the time, so the ability to put primitives in your Lists should not be a consideration for the vast majority of use cases.
For completeness: the only time that I have seen Java fail to implicitly convert from a primitive wrapper was when those wrappers were composed in a higher order structure: It could not convert a Double[] into a double[].
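When you do hit the Double[] versus double[] case, the conversion has to be written out element by element. Here are two ways to do it; the Unboxing class name is made up, and the stream version assumes Java 8 or later.

```java
import java.util.Arrays;

class Unboxing {
    // Plain loop: auto-unboxing works per element, just not for the array as a whole.
    static double[] toPrimitive(Double[] boxed) {
        double[] result = new double[boxed.length];
        for (int i = 0; i < boxed.length; i++) {
            result[i] = boxed[i]; // throws NullPointerException if the element is null
        }
        return result;
    }

    // Same conversion using streams (Java 8+).
    static double[] toPrimitiveWithStreams(Double[] boxed) {
        return Arrays.stream(boxed).mapToDouble(Double::doubleValue).toArray();
    }
}
```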
It mostly comes down to flexibility/ease of use versus efficiency. If you don't know how many elements will be needed in advance, or if you need to insert in the middle, ArrayLists are a better choice. They use Arrays under the hood, I believe, so you'll want to consider using the ensureCapacity method for performance. Arrays are preferred if you have a fixed size in advance and won't need inserts, etc.
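A small sketch of the ensureCapacity idea, assuming you know roughly how many elements are coming. CapacityDemo and squares are hypothetical names. Note that ensureCapacity is declared on ArrayList, not on the List interface, so the variable has to be typed as ArrayList here.

```java
import java.util.ArrayList;

class CapacityDemo {
    static ArrayList<Integer> squares(int count) {
        ArrayList<Integer> result = new ArrayList<>();
        // Grow the backing array once up front instead of several times while adding.
        result.ensureCapacity(count);
        for (int i = 0; i < count; i++) {
            result.add(i * i);
        }
        return result;
    }
}
```

Passing the expected size to the constructor, new ArrayList<>(count), has the same effect when the size is known at creation time.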
This is a real beginner question (I'm still learning the Java basics).
I can (sort of) understand why methods would return a List<String> rather than an ArrayList<String>, or why they would accept a List parameter rather than an ArrayList. If it makes no difference to the method (i.e., if no special methods from ArrayList are required), this would make the method more flexible, and easier to use for callers. The same thing goes for other collection types, like Set or Map.
What I don't understand: it appears to be common practice to create local variables like this:
List<String> list = new ArrayList<String>();
While this form is less frequent:
ArrayList<String> list = new ArrayList<String>();
What's the advantage here?
All I can see is a minor disadvantage: a separate "import" line for java.util.List has to be added. Technically, "import java.util.*" could be used, but I don't see that very often either, probably because the "import" lines are added automatically by some IDE.
When you read
List<String> list = new ArrayList<String>();
you get the idea that all you care about is being a List<String> and you put less emphasis on the actual implementation. Also, you restrict yourself to members declared by List<String> and not the particular implementation. You don't care if your data is stored in a linear array or some fancy data structure, as long as it looks like a List<String>.
On the other hand, reading the second line gives you the idea that the code cares about the variable being ArrayList<String>. By writing this, you are implicitly saying (to future readers) that you shouldn't blindly change actual object type because the rest of the code relies on the fact that it is really an ArrayList<String>.
Using the interface allows you to quickly change the underlying implementation of the List/Map/Set/etc.
It's not about saving keystrokes, it's about changing implementation quickly. Ideally, you shouldn't be exposing the underlying specific methods of the implementation and just use the interface required.
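A minimal sketch of what that buys you: the methods below only know about List, so the caller can hand in an ArrayList or a LinkedList without any other change. InterfaceDemo and its methods are invented for illustration.

```java
import java.util.ArrayList;
import java.util.LinkedList;
import java.util.List;

class InterfaceDemo {
    // Depends only on the List interface, not on any implementation.
    static String joinUpper(List<String> words) {
        StringBuilder sb = new StringBuilder();
        for (String word : words) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(word.toUpperCase());
        }
        return sb.toString();
    }

    // Fills whatever list it is given and processes it.
    static String run(List<String> list) {
        list.add("alpha");
        list.add("beta");
        return joinUpper(list);
    }

    public static void main(String[] args) {
        // Swapping the implementation is a one-token change at the point of creation.
        System.out.println(run(new ArrayList<String>()));
        System.out.println(run(new LinkedList<String>()));
    }
}
```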
I would suggest thinking about this from the other direction. Usually you want a List or a Set or some other Collection type, and you really do not care in your code how exactly it is implemented. Hence your code just works with a List and does whatever it needs to do (also phrased as "always code to interfaces").
When you create the List, you need to decide what actual implementation you want. For most purposes ArrayList is "good enough", but your code really doesn't care. By sticking to using the interface you convey this to the future reader.
For instance I have a habit of having debug code in my main method which dumps the system properties to System.out - it is usually much nicer to have them sorted. The easiest way is to simply let "Map map = new TreeMap(properties);" and THEN iterate through them, as TreeMap returns the keys sorted.
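Spelled out with generics, that debug snippet could look roughly like this. SortedPropertiesDump is a made-up name, and the sketch relies on the property keys being Strings (TreeMap needs mutually comparable keys).

```java
import java.util.Map;
import java.util.Properties;
import java.util.TreeMap;

class SortedPropertiesDump {
    // Copies the properties into a TreeMap, which iterates in key order.
    static Map<Object, Object> sorted(Properties properties) {
        return new TreeMap<Object, Object>(properties);
    }

    public static void main(String[] args) {
        for (Map.Entry<Object, Object> entry : sorted(System.getProperties()).entrySet()) {
            System.out.println(entry.getKey() + "=" + entry.getValue());
        }
    }
}
```

The variable is typed as Map, so nothing downstream knows or cares that a TreeMap is doing the sorting.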
When you learn more about Java, you will also see that interfaces are very helpful in testing and mocking, since you can create objects with behaviour specified at runtime conforming to a given interface. An advanced (but simple) example can be seen at http://www.exampledepot.com/egs/java.lang.reflect/ProxyClass.html
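In case that link is unavailable, here is a small sketch of the same idea using java.lang.reflect.Proxy. It is not taken from the linked page; the Greeter interface and the stubGreeter helper are invented for this example.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Method;
import java.lang.reflect.Proxy;

// Hypothetical interface that production code would depend on.
interface Greeter {
    String greet(String name);
}

class ProxyDemo {
    // Builds a Greeter whose behaviour is supplied at run time, e.g. as a test stub.
    static Greeter stubGreeter(final String cannedReply) {
        InvocationHandler handler = new InvocationHandler() {
            @Override
            public Object invoke(Object proxy, Method method, Object[] args) {
                if (method.getName().equals("greet")) {
                    return cannedReply + ", " + args[0];
                }
                // This sketch does not handle toString, equals or hashCode.
                throw new UnsupportedOperationException(method.getName());
            }
        };
        return (Greeter) Proxy.newProxyInstance(
                Greeter.class.getClassLoader(),
                new Class<?>[] {Greeter.class},
                handler);
    }
}
```

This only works because the calling code is written against the Greeter interface; Proxy cannot stand in for a concrete class.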
If you later want to change the implementation of the list and use, for example, LinkedList (maybe for better performance), you don't have to change the whole code (and the API, if it's a library). If order doesn't matter, you should return Collection, so that later on you can easily change it to a Set, or to a sorted set such as TreeSet if you need the items to be sorted.
The best explanation I can come up with (because I don't program in Java as frequently as in other languages) is that it makes it easier to change the "back-end" list type while maintaining the same code/interface everything else is relying on. If you declare it as a more specific type first, then later decide you want a different kind... if something happens to use an ArrayList-specific method, that's extra work.
Of course, if you actually need ArrayList-specific behavior, you'd go with the specific variable type instead.
The point is to identify the behavior you want/need and then use the interface that provides that behavior. That is the type for your variable. Then, use the implementation that meets your other needs (efficiency, etc.). This is what you create with "new". This duality is one of the major ideas behind OOD. The issue is not particularly significant when you are dealing with local variables, but it rarely hurts to follow good coding practices all the time.
Basically this comes from people who have to run large projects, and possibly from other reasons; you hear it all the time. Why, I don't actually know. If you need an ArrayList, a HashMap, a HashSet or whatever else, I see no point in eliminating methods by casting to an interface.
Let us say, for example, that I recently learned how to use HashSet and implemented it as a principal data structure. Suppose, for whatever reason, I went to work on a team. Would not the next person need to know that the data was keyed on hashing approaches rather than being ordered on some basis? The back-end approach noted by Twisol works in C/C++, where you can expose the headers and sell a library; if someone knows how to do that in Java, I would imagine they would use JNI, at which point it seems simpler to me to use C/C++, where you can expose the headers and build libs using established tools for that purpose.
By the time you can get someone who can install a jar file in the extensions dir, it would seem to me that entity could be just short steps away. I dropped several crypto libs in the extensions directory, and that was handy, but I would really like to see a clear, concise basis elucidated. I imagine they do that all the time.
At this point it sounds to me like classic obfuscation, but beware: You have some coding to do before the issue is of consequence.
Besides the dynamic nature of Python (and the syntax), what are some of the major features of the Python language that Java doesn't have, and vice versa?
List comprehensions. I often find myself filtering/mapping lists, and being able to say [line.replace("spam","eggs") for line in open("somefile.txt") if line.startswith("nee")] is really nice.
Functions are first class objects. They can be passed as parameters to other functions, defined inside other function, and have lexical scope. This makes it really easy to say things like people.sort(key=lambda p: p.age) and thus sort a bunch of people on their age without having to define a custom comparator class or something equally verbose.
Everything is an object. Java has basic types which aren't objects, which is why many classes in the standard library define 9 different versions of functions (for boolean, byte, char, double, float, int, long, Object, short). Arrays.sort is a good example. Autoboxing helps, although it makes things awkward when something turns out to be null.
Properties. Python lets you create classes with read-only fields, lazily-generated fields, as well as fields which are checked upon assignment to make sure they're never 0 or null or whatever you want to guard against, etc.
Default and keyword arguments. In Java if you want a constructor that can take up to 5 optional arguments, you must define 6 different versions of that constructor. And there's no way at all to say Student(name="Eli", age=25)
Multiple return values. In Java, functions can only return 1 thing. In Python you have tuple assignment, so you can say spam, eggs = nee() but in Java you'd need to either resort to mutable out parameters or have a custom class with 2 fields and then have two additional lines of code to extract those fields.
Built-in syntax for lists and dictionaries.
Operator Overloading.
Generally better designed libraries. For example, to parse an XML document in Java, you say
Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().parse("test.xml");
and in Python you say
doc = parse("test.xml")
Anyway, I could go on and on with further examples, but Python is just overall a much more flexible and expressive language. It's also dynamically typed, which I really like, but which comes with some disadvantages.
Java has much better performance than Python and has way better tool support. Sometimes those things matter a lot and Java is the better language than Python for a task; I continue to use Java for some new projects despite liking Python a lot more. But as a language I think Python is superior for most things I find myself needing to accomplish.
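For comparison with the people.sort(key=lambda p: p.age) one-liner above, this is roughly what the "custom comparator" route looks like in Java. Person and SortByAge are made-up classes for the sketch. Java 8 and later allow a lambda or Comparator.comparingInt in place of the anonymous class, which narrows the gap.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

class Person {
    final String name;
    final int age;

    Person(String name, int age) {
        this.name = name;
        this.age = age;
    }
}

class SortByAge {
    // Returns the names ordered by ascending age, leaving the input list untouched.
    static List<String> namesByAge(List<Person> people) {
        List<Person> sorted = new ArrayList<>(people);
        Collections.sort(sorted, new Comparator<Person>() {
            @Override
            public int compare(Person a, Person b) {
                return Integer.compare(a.age, b.age);
            }
        });
        List<String> names = new ArrayList<>();
        for (Person person : sorted) {
            names.add(person.name);
        }
        return names;
    }
}
```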
I think this pair of articles by Philip J. Eby does a great job discussing the differences between the two languages (mostly about philosophy/mentality rather than specific language features).
Python is Not Java
Java is Not Python, either
One key difference in Python is significant whitespace. This puts a lot of people off - me too for a long time - but once you get going it seems natural and makes much more sense than ;s everywhere.
From a personal perspective, Python has the following benefits over Java:
No Checked Exceptions
Optional Arguments
Much less boilerplate and less verbose generally
Other than those, this page on the Python Wiki is a good place to look with lots of links to interesting articles.
With Jython you can have both. It's only at Python 2.2, but still very useful if you need an embedded interpreter that has access to the Java runtime.
Apart from what Eli Courtwright said:
I find iterators in Python more concise. You can use for i in something, and it works with pretty much everything. Yeah, Java has gotten better since 1.5, but, for example, you can iterate through a string in Python with this same construct.
Introspection: In Python you can get information at runtime about an object or a module: its symbols, methods, or even its docstrings. You can also instantiate them dynamically. Java has some of this, but usually in Java it takes half a page of code to get an instance of a class, whereas in Python it is about 3 lines. And as far as I know, the docstrings thing is not available in Java.