replacement project for existing school assignment - java

I have a school assignment which consists of programming a scanner/lexical analyzer for a specified simple language. The scanner has to be programmed in C++.
This type of assignment has been used since the '90s and, although still a valid exercise, I consider it to be a little antiquated and a little boring.
I have gotten permission to come up with a new programming assignment.
It has to be of equal difficulty and it can be in C++, Objective C or Java.
What direction should I go that has the same level of difficulty but is a little more modern and applicable to modern CS/life?
Thanks

This type of assignment... is considered to be a little antiquated and a little boring.
I'm curious: who considers this antiquated? Your professor? Somebody notable in the parsing community? Or you?
Scanners and parsers are still relevant to professional software development and, more importantly, relevant to the science of computation. If you wish to understand computers, then you should understand scanners and parsers.
Still, if you are convinced that you should do some other assignment, why not write a tool to generate a scanner in C++? You could supply, as input, a set of regular expressions that define the tokens of the grammar, and it would produce a C++ program that would recognize the input tokens. Then, you will never need to write a scanner ever again!
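To make the idea of token definitions supplied as regular expressions concrete, here is a minimal Java sketch. It is not the generator described above: instead of emitting a C++ program, it interprets the rules directly, which is only the first half of the suggested project. The class name `RegexScanner`, the `rule` method and the token names are hypothetical, chosen for illustration.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class RegexScanner {
    // Token definitions, tried in insertion order (names are illustrative).
    private final Map<String, Pattern> rules = new LinkedHashMap<>();

    public RegexScanner rule(String name, String regex) {
        rules.put(name, Pattern.compile(regex));
        return this;
    }

    // Returns tokens as "NAME:lexeme" strings; whitespace is skipped.
    public List<String> scan(String input) {
        List<String> tokens = new ArrayList<>();
        int pos = 0;
        while (pos < input.length()) {
            if (Character.isWhitespace(input.charAt(pos))) {
                pos++;
                continue;
            }
            boolean matched = false;
            for (Map.Entry<String, Pattern> rule : rules.entrySet()) {
                Matcher m = rule.getValue().matcher(input);
                m.region(pos, input.length());
                // lookingAt() anchors the match at the start of the region.
                if (m.lookingAt() && m.end() > pos) {
                    tokens.add(rule.getKey() + ":" + m.group());
                    pos = m.end();
                    matched = true;
                    break;
                }
            }
            if (!matched) {
                throw new IllegalArgumentException("Unexpected character at position " + pos);
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        RegexScanner scanner = new RegexScanner()
                .rule("NUMBER", "[0-9]+")
                .rule("IDENT", "[A-Za-z_][A-Za-z0-9_]*")
                .rule("OP", "[+\\-*/=]");
        // Prints [IDENT:x, OP:=, NUMBER:42, OP:+, IDENT:y1]
        System.out.println(scanner.scan("x = 42 + y1"));
    }
}
```

Note that this sketch takes the first rule that matches rather than the longest match, so rule order matters; a real scanner generator would typically resolve such conflicts more carefully.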

Why do you think that Lexers / Parsers are not relevant anymore? I find that I write something along those lines at least once a year.

As a software engineer, I would say the code you write during your CS courses may well be the best code you write in your life. Once you come into the industry, you will probably write only modules and not the entire thing.
Believe me. Once you come into the industry and have spent some time here, you will just want to write those compilers, assemblers and lexical analyzers. So please don't miss the chance.
I believe the challenges in writing this "boring" stuff are well worth it, and you will find them truly interesting once you start designing it.

Writing a scanner/lexical analyzer was one of my favorite assignments. I would argue that it was also one of the most relevant. It is a real world problem.
My guess is that it does not feel modern because of the simple programming language you are scanning. I personally would change out the simple programming language for something like Markdown or Textile. Both of these are used in modern programming, and will teach you similar concepts.


Approach to learning algorithms using a specific language

So for the summer I decided that I may as well start learning algorithms before school starts. I've been told that the class is fairly fast paced, and that algorithms isn't something you should take lightly (I have a tendency to do this with all the course work during the semester lol).
The book we're going to use is Algorithms (4th Edition).
Anyway, this is my problem.
I'm almost a third of the way through the book, but I just realized what I was doing. For example, I would read and re-read the sections I don't quite understand. Then, if I feel confident enough, I would try to reproduce the same algorithm in Java from my head. But by doing this, my code looks almost exactly like the code in the book... in Java.
I can't say I'm just memorizing code after code--I do understand the concepts and they help me code these algorithms--but I feel like I'll only be able to implement these algorithms in Java. I should note that I only know Java at the moment.
tldr: I'm learning algorithms as if I'm learning to play the guitar--repetition after repetition. But by doing so I feel like I'm becoming so fixated that I'll only be able to implement these in Java. How exactly would you learn algorithms if the book you're using is language-specific?
Thanks in advance.
Don't Confuse Yourself
You're studying Java, so write them in Java, especially if Java is your first language. Don't confuse yourself for now, as you are trying to learn 2 things at once: how to program in Java, and how to program. You're learning both a new language and a way of thinking. Don't overdo it by adding another language to the mix for now.
Diversify
Later on, or if you feel confident enough that you can take on another language simultaneously, then it would obviously be beneficial to learn another one and try to replicate the algorithms without looking at the book.
Reproduce and Extend
What I would recommend is to look for variants of the algorithms: known, documented variants where you can just read the description and try to implement it from the "base" version, without needing to read the book.
For instance, if your book introduced you to a linked list, you should be able to come up with the algorithm for a doubly-linked list or a circular linked list without reading more than a description of the desired outcome. Or there's something about the original concepts that you clearly misunderstood.
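As a rough illustration of the exercise suggested above, here is what extending a singly-linked list into a doubly-linked one might look like in Java. This is a minimal sketch, not taken from any book; the class and method names are hypothetical. The only conceptual addition over the singly-linked version is the `prev` link, which is what makes removal from the tail and backwards traversal cheap.

```java
import java.util.NoSuchElementException;

public class DoublyLinkedList<T> {
    // Each node knows both neighbours; a singly-linked node would only have 'next'.
    private static final class Node<T> {
        T value;
        Node<T> prev;
        Node<T> next;

        Node(T value) {
            this.value = value;
        }
    }

    private Node<T> head;
    private Node<T> tail;
    private int size;

    public void addLast(T value) {
        Node<T> node = new Node<>(value);
        if (tail == null) {
            head = tail = node;      // first element: it is both head and tail
        } else {
            node.prev = tail;        // link the new node back to the old tail
            tail.next = node;
            tail = node;
        }
        size++;
    }

    public T removeLast() {
        if (tail == null) {
            throw new NoSuchElementException("list is empty");
        }
        T value = tail.value;
        tail = tail.prev;            // constant time thanks to the 'prev' link
        if (tail == null) {
            head = null;             // the list became empty
        } else {
            tail.next = null;
        }
        size--;
        return value;
    }

    public int size() {
        return size;
    }

    // Walks the list from tail to head, which a singly-linked list cannot do directly.
    public String backwards() {
        StringBuilder sb = new StringBuilder();
        for (Node<T> n = tail; n != null; n = n.prev) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(n.value);
        }
        return sb.toString();
    }
}
```

A circular variant would be a similar exercise: instead of `null` at the ends, the tail's `next` points back to the head.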
Try First, Read-On Later
I'd even recommend that you try to implement the algorithms described in your book before it shows them to you. The point of seeing Sedgewick's algorithm is to see a canonical implementation, which is considered a standard blueprint. Just read the section leading up to the implementation (which hopefully comes first), then sit down with the book and try to figure out how you could do it yourself. If you can't do that at all, then you're too far ahead in your book and should backtrack and start again from scratch.
Thing about algorithms, they're essentially language-agnostic. There's really nothing stopping you from doing Sedgewick's examples in C, Python or some other language.
If you really don't know any other languages, concentrate on Java. Sure, it's a bit repetitious, but those bits will stick in your head in a good way and, come test time, you'll be glad for the information.
You're in an interesting position right now, since the kind of thinking required to write programs is very different from normal thinking. Add to that the fact you're learning a whole new language with a different syntax, punctuation and the like. Practice really does make perfect, since there are many bits and pieces to remember.
Oh, if you want practice with algorithms, try out Project Euler, code katas and other challenge sites. These little challenges can help you familiarize yourself with the language as well as get comfortable with the type of thinking required.
First, congrats on taking your first steps on learning how to code. I would say that you are already ahead of your peers by starting to look ahead during the summer.
As far as your fears of only being able to implement algorithms in Java, you have already demonstrated that it will not be a problem for you. It sounds like you are passionate enough to get started early, so you should have no problem implementing a solution in multiple languages. Additionally, most languages with C/C++-like syntax (Java and C#, to name a few) will be similar enough that you will be able to translate your knowledge seamlessly.
The best advice that I can give is to CODE, CODE, CODE!! Don't just read about the algorithms; actually implement them.
You don't say how well you know the mathematics behind the algorithms. That will be key in determining your facility with the code.
Sedgewick's books are very good. I'd feel free to pick some and check out other books as well, like "Numerical Recipes" and "Numerical Methods That Work". See if another point of view can clarify for you.
If you don't feel like you're getting enough out of copying Java, see if you can translate the algorithms into another language, maybe Python or a purely functional alternative. If you can do that, you'll know you've got it.
I would either try to learn another language to verify that you can actually port the algorithms (JavaScript would be my vote because it is simple and useful on both the front end and the back end) or write the algorithms out in pseudocode, since that is more language-agnostic. In most languages the code will look pretty similar. The only thing to be very careful about is relying on some aspect of the language (such as generics or iterators in Java) which you may not be able to use in another language, as that could leave a gap in your understanding.
Another way to verify that you actually understand the algorithm is to make slight changes in the problem and make sure that you can adjust the algorithm to still work. For example if it is a sorting algorithm then try to sort by several different attributes rather than just one, if it is a graph algorithm make the graph a digraph and see how things should change.
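The sorting variation mentioned above can be sketched in Java by keeping the algorithm fixed and passing the ordering in as a parameter. This is only an illustrative sketch: the `Student` type, its fields and the choice of insertion sort are assumptions made for the example, not something from the book.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

public class MultiKeySort {
    // Hypothetical record type used only for this illustration.
    public static final class Student {
        public final String name;
        public final int grade;

        public Student(String name, int grade) {
            this.name = name;
            this.grade = grade;
        }
    }

    // Plain insertion sort; the ordering is supplied by the caller,
    // so the algorithm itself never changes when the sort keys do.
    public static <T> void insertionSort(List<T> items, Comparator<? super T> order) {
        for (int i = 1; i < items.size(); i++) {
            T current = items.get(i);
            int j = i - 1;
            // Shift larger elements one slot to the right.
            while (j >= 0 && order.compare(items.get(j), current) > 0) {
                items.set(j + 1, items.get(j));
                j--;
            }
            items.set(j + 1, current);
        }
    }

    // Highest grade first; ties are broken alphabetically by name.
    public static Comparator<Student> byGradeThenName() {
        return Comparator.comparingInt((Student s) -> s.grade)
                .reversed()
                .thenComparing((Student s) -> s.name);
    }

    public static void main(String[] args) {
        List<Student> students = new ArrayList<>();
        students.add(new Student("Cid", 90));
        students.add(new Student("Ann", 90));
        students.add(new Student("Bob", 95));
        insertionSort(students, byGradeThenName());
        for (Student s : students) {
            System.out.println(s.name + " " + s.grade); // Bob 95, Ann 90, Cid 90
        }
    }
}
```

If adapting the algorithm to a new ordering only requires writing a new comparator, that is a reasonable sign the algorithm itself has been understood separately from the data it sorts.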
I'm learning algorithms as if I'm learning to play the guitar--repetition after repetition.
Then you are not learning algorithms. You are learning repetition. Two different things. The usage of a programming language by an algorithms book is a secondary factor. It is just a vehicle of instruction, an implementation detail.
What you should be focusing on is understanding the structure, logic and mathematical characteristics of an algorithm (and possibly the data structure(s) associated with it).
That's what your focus should be.
But by doing so I feel like I'm becoming so fixated that I'll only be able to implement these in Java.
But that is because you are focusing on just how the algorithm is being coded (in Java in this particular case.) You are focusing on an implementation detail.
When you learn to drive, you don't focus on how to drive a Honda Civic or a Nissan Maxima. You learn the essence of what driving is, the rules of thumb, the necessary precautions and the laws governing driving a vehicle.
Same with learning algorithms. You don't learn "Algorithms in Java" any more than "Algorithms in Haskell". You learn Algorithms first and foremost; the vehicle (sans very specialized cases) is secondary.
You should be focusing on what the algorithm does, how and why. Questions like "how/why does it work?" and, most importantly, "what are the performance characteristics?" are the things you should be focusing on.
Every good algorithms book (Sedgewick's included) carries that message. That's what you should focus on. How you get to that re-focusing is a function of your personal learning strategies.
How exactly would you learn algorithms if the book you're using is language-specific?
By not focusing on the language. Focus on the structure, focus on the data structures involved, the invariants, pre-conditions and post-conditions. Understand asymptotic behavior described in Big-O (or Big-Omicron), Little-O/Little-Omicron and Omega notations.
You are learning algorithms, not programming in Java via coding algorithms.
If you can't make this mental leap, it means you have not yet had sufficient practice with abstract analysis. That is not an insult, but an observation and a piece of advice. Coding, the usage of a programming language, is typically secondary to the mathematical analysis of computing, which is the focus of Computer Science (of which Algorithms is a part).
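To show what focusing on invariants, pre-conditions and post-conditions can look like in practice, here is a small sketch using binary search. Java is used only because it is the language of the book under discussion; the comments carry the actual content and would read the same in any language. The class and method names are hypothetical.

```java
public class BinarySearchInvariant {
    // Precondition:  'sorted' is in ascending order.
    // Postcondition: returns an index i with sorted[i] == key, or -1 if key is absent.
    // Cost:          O(log n) comparisons, because the range [lo, hi] halves each step.
    public static int indexOf(int[] sorted, int key) {
        int lo = 0;
        int hi = sorted.length - 1;
        // Loop invariant: if key is present at all, its index lies in [lo, hi].
        while (lo <= hi) {
            int mid = lo + (hi - lo) / 2; // avoids int overflow of (lo + hi)
            if (sorted[mid] < key) {
                lo = mid + 1;             // key cannot be at mid or to its left
            } else if (sorted[mid] > key) {
                hi = mid - 1;             // key cannot be at mid or to its right
            } else {
                return mid;
            }
        }
        // The range is empty, so by the invariant the key is not in the array.
        return -1;
    }
}
```

The three comments (precondition, invariant, postcondition) are the part worth memorizing; the surrounding syntax is the implementation detail.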
NOTE I've done Java for over 10 years, and though I like it for work, I strongly believe it is a poor tool for learning programming or CS topics.
One is better served by learning Algorithms with either A) a procedural, systems-level programming language like C or Ada, or a high-level pseudo-assembler simulator, or B) a functional language like Lisp or Haskell.
Object-Oriented features in pure/pseudo-pure OO languages simply get in the way.
Algorithms are mathematical structures with a nature descriptive of the how (operationally) and/or the what (mathematically). The former is perfectly suited for procedural programming, the latter for functional programming.

Functional programming and Haskell [closed]

Closed 11 years ago.
I come from a background of 1 year of programming in HTML/CSS/JavaScript/jQuery and 6 months of Java JSP/Servlets. I am in the 2nd year of college, and in the last semester of the second year I didn't pass the Functional Programming course, in which we were learning Haskell (maybe mostly because I missed 90% of the classes). It seems that in my second year I will also have a course in which Haskell is involved, so learning just the basics won't be enough.
What I am interested in is:
- the differences between OOP programming and functional programming
- what book is recommended for a beginner in functional programming using Haskell (I can't seem to make head or tail of what the professor wrote)
- where to go to practice the language after I'm done with the book
- what I can do with Haskell and can't do in Java
- do I need a lot of math for understanding Haskell (my college professor used a lot of math-related stuff in his course)
the differences between OOP programming and functional programming
From your background, you probably don't know enough about OOP for comparisons to be useful. Just forget about it and learn functional programming as itself.
what book is recommended for a beginner in functional programming using Haskell (I can't seem to make head or tail of what the professor wrote)
Everyone else keeps mentioning Learn You A Haskell for a reason. :]
where to go to practice the language after I'm done with the book
On your computer? Get the compiler, get a code editor, start programming. Learning by doing is the best way.
what I can do with Haskell and can't do in Java
Trivially, nothing. Both languages are capable of doing anything you might want to do, the end.
And again, you haven't spent enough time with Java for comparing the languages to be helpful anyway, so just learn Haskell as itself.
do I need a lot of math for understanding Haskell (my college professor used a lot of math-related stuff in his course)
Not really. A little bit of discrete math and formal logic helps, though, but that's the sort of stuff you should get in any CS program anyway.
If you missed most of the lectures, then I haven't got a lot of sympathy. But I'll try to help.
1: Differences in OOP and Functional: big question. For now, I would try to approach Haskell with an open mind rather than trying to understand it in terms of its difference from OOP.
2: "Learn you a Haskell" and "Real World Haskell" are both available on the Net.
3: Work through the exercises in the books. Then look at the exercises in Project Euler.
4: Both are general purpose languages, so any application can be written in both. Haskell enables greater type safety and shorter code.
5: No, but the maths helps you understand it at a deeper level. I picked up the relevant math as I went along. Look up maths terminology on Wikipedia, and don't sweat it too much.
Start with Learn You a Haskell for Great Good. Also, look at the design of the jQuery library, as many of its features are designed around functional programming techniques.
Also, I highly recommend that you spend some time brushing up on your English skills as well. In my opinion, programming is at least as much about language as it is about math. From your comments so far, I suspect that your approach to both has been somewhat sloppy. That's going to be the hardest thing to overcome. As a general rule, programming systems are quite rigorous, and one way or another you'll need to learn to be more precise in the way you organize your thoughts.
I agree with the opinions above: missing lectures is a bad thing, and good English is a nice skill to have.
Of course, the already mentioned Learn You a Haskell for Great Good is the place to start.
Here are recordings of an exercise class in Germany, but the spoken language is English (with a German accent).
One thing you should also be aware of is Hoogle, one of the greatest things, if not the greatest, when it comes to learning Haskell (imho): a type-searchable documentation.
if you search for a function that pulls out the end of a list - but you don't know the name:
hoogle: [a]-> a
lists all functions that have the given type signature - last, which is the function you looked for, is one of them.
Another thing that helped me develop my Haskell skills is syntastic in vim, a syntax checker that sped up the "compile - check - run" cycle by a massive amount, and hlint, a linting tool that makes code much more readable and shows you unnecessary stuff you added to your code. I really learned a lot from tidying up my code that way.
For starting with IO - there's this great article. It is also a great introduction how experienced functional programmers think.
For advanced stuff and getting to know monads, there is the Monad Reader (both the recent and the older issues), which I've heard is well worth tackling, especially #13.
If you already know and like Java, have a look at Clojure.

Tutorial on C pointers and arrays from a Java standpoint

I'm currently a freshman in college, majoring in CS. I'm just about done with my "Intro to Computer Programming" class. I like it and feel like I'm learning a good bit.
A couple of days ago, I read Joel's The Perils of JavaSchools. "A Linked List?" I thought, "those aren't even hard. We've done a bunch of those already in the class I'm in right now." Which is correct, because in Java, they're not that hard. But anyway, I decided to give writing one in C a try.
And it is SO HARD!
Joel was right, I think ... Java deals with so many little itsy-bitsy things for you that it's really not that hard. But I'm determined to overcome my school's Java-tude and learn how to write this dang linked list in C.
So I guess, instead of trying to ask lots and lots of tiny questions, I am asking: does anyone know of a good (& free) online tutorial for learning C? Specifically, for learning how to deal with pointers and all those symbols (&, *, **, [] and how they work together). I'd like to think I'm already pretty proficient in Java, so I don't need the tutorials on how to write a "Hello, World!" program. But then I'm definitely not ready to get into any super-advanced C or C++ anything, because all I know is Java.
Any help appreciated!
Some tutorials:
Moving from Java to C++
Learning C from Java
C for Java Programmers course (course notes and slides)
Some good pointer answers which might help:
What are the barriers to understanding pointers and what can be done to overcome them?
What is a void pointer and what is a null pointer?
Arrays, what's the point?
What do people find difficult about C pointers?
Switching from Java to C++ - what's the easy way?
Is Java pass by reference?
The first is a damn good read about pointers and their pitfalls, if you can get past the Pascal syntax.
Check and see if your curriculum requires Systems Programming. It's usually a 300-level sophomore course, and I'm enrolled for that next semester. It involves a lot of work with C and GCC on Unix.
Check your CS dept library, if one exists. I picked up a copy of K&R to work on through winter break.
This is for C++, not C, but everything up until about Chapter 3.7 or so talks about stuff at the machine level in a way that's useful for would-be C programmers.
There are numerous guides across the internet for learning pointers. Here's one: http://pweb.netcom.com/~tjensen/ptr/pointers.htm which I've used.
I'm also going to suggest this book to you: Hacking, the Art of Exploitation 2nd Ed.
This book will not make you a "hacker". Nothing but lots of reverse engineering / studying binary code, trial and error etc is going to do that. It does, however, introduce to you how you start doing these things and that comes down to a fundamental understanding of how C works, including pointers. Its introduction to assembly/C is one of the best I've seen because it runs you through several C examples and how you investigate what's going on with gdb, a command line debugging tool. That way you can see the C and see the assembly. This includes a fundamental understanding of what pointers are.
This book will as a side-effect give you an introduction to the stack and the heap, data structures etc. In short, reading the intro sections will give you a lot of benefit for the rest of your course.

C for Java Programmer? [duplicate]

Closed 12 years ago.
Possible Duplicate:
Should I learn C before learning C++?
As a professional (Java) programmer and heavy Linux user I feel it's my responsibility to learn some C (even though I may never use it professionally), just to make me a better coder.
Two questions :
Should I try C or C++ first - I realise they are different languages with some common ground. Is it useful to learn a bit of both, or just try one? I hear C++ is a bit of a nightmare behemoth of a language.
What are the best resources (books, tutorials, practice programs, reference code) for a Java developer like myself?
Thanks
C is simple to learn, difficult to master. As a Java programmer, the barrier will be memory and structure... and undoing the damage Java may have done to the algorithm-producing portions of your brain ;)
I would recommend getting familiar with the GCC toolchain on your Linux box, through tutorials on the Internet. Then read The C Programming Language, and a copy of Linux Application Development doesn't hurt. Also, on Linux glib will save you from reinventing the wheel ... but at least try to create your own linked-list, hashmap, graph and other API to learn. Pointer arithmetic, arrays and learning that elements such as structs are just named-offsets in a binary chunk are all important. Spend time with malloc and free and memcheck. With C, your tools and toolchain are very important and the IDE isn't necessarily your friend when learning.
I would pick C over C++ as C is a good foundation to get used to the memory management and pointer usage of C.
The best thing you can do is apply what you learn to a real project. But be prepared to bash your head against the wall a lot in Valgrind and GDB. I have been programming C for years, and I am still no C monk.
I do agree that C is a great language, it shows up a bad programmer. But remember:
Any sufficiently complicated C program contains an ad hoc, informally-specified, bug-ridden, slow implementation of half of Common Lisp.
The moral of which is learn other languages too, rather than just C-derived ones! Consider some Lisp dialect, Erlang (which is cool at the moment), Haskell, etc. They will expand your horizons from the 2x2 cell of Java. Consider looking at SICP too.
Coming from ASM, C, then C++, and finally landing (forced) into the Java territory, I thought I may provide an opinion about the subject.
First, with all due respect to the Java community, industry experience shows that while C/C++ programmers can get used to Java principles and programming (it may not be that easy), the opposite happens more rarely. In other words, a C++ programmer will have to learn and cope with tons of Java rules (frameworks...), but she will eventually (and usually) be able to produce long-lived working code by injecting a lot of her system experience into the process. A Java programmer going to C, used to more theoretical principles and strict structure rules, may
feel insecure, as she has to decide many things, like program organization and structure
feel surprised by pointers and memory management: allocation and freeing have to be thought through carefully, which means discovering the world of memory leaks
feel discouraged, as the bugs won't appear in black and white in a console, dictated by the JVM through 200 lines of stack trace, but may happen at a deeper, system level, maybe caught thanks to an IDE in front of which the Java programmer will contemplate some assembly code for the 1st time in her life
feel perplexed as to which algorithm to use and how to implement it, the one that was built into Java and that she never had to worry about...
So, now, my task is to help you to feel secure, confident, and motivated before learning C/C++!
My advice is to start with C, because
C by itself has all the concepts you never had to face with Java
as a Java programmer you already have a classes 'approach', and starting with C++, you may be tempted to stick to the Java OO principles
C principles are limited to a few. C looks like the very last human thing before entering the dark world of assembly language.
What I would emphasize during the C study is, for instance
Pointers: Java uses pointers of course, but hides them from you while actually passing objects to methods as pointers. In C, you will have to learn the difference between passing by value and passing by reference, as well as the more subtle difference between char x[3] and char *x = "ab". You will also learn how convenient pointers are for walking through an array, with idioms such as *++x or *x++. A lot has been said about pointers; my advice would be
always keep control, i.e. don't add more levels of indirection than necessary
don't typedef pointers, like typedef int *pointerToInt; it seems easier at first (pointerToInt pi instead of int *pi), but after a few levels I would rather count the stars than the 'pointerTo's [some big companies do that]. The exception may be pointers to functions, which are unreadable anyway.
Memory: When you need memory, you allocate it, and when you don't need it anymore, you free it. Easy. The hard part is allocating the right amount and not freeing it twice... You have to get used to that, and to the heap, the stack... And when your program runs and accesses address NULL (0), you may get a visible exception. Maybe not.
Data structure: So you want to use a convenient HashMap? Sure, after you have developed it. Or there are libraries you can link that do that kind of thing; just choose the right one. To be fair, when programming in C, you [have to] think differently, and may find a more appropriate algorithm than a map for a given problem.
All in all, you will feel disoriented at first, but eventually (unless you have nightmares) you will find before you a lot of room for fun and pleasure. C allows a person to program with complete freedom: C goes according to your ideas, not the opposite.
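The remark above that Java passes objects to methods as hidden pointers can be checked from the Java side with a short sketch. The names `ReferenceDemo`, `Box`, `mutate` and `reassign` are hypothetical. What it illustrates is that the method receives a copy of the reference: changes made through the reference are visible to the caller, but pointing the parameter at a new object is not.

```java
public class ReferenceDemo {
    // Hypothetical mutable holder used only for this illustration.
    public static final class Box {
        public int value;

        public Box(int value) {
            this.value = value;
        }
    }

    // Follows the reference and changes the caller's object.
    public static void mutate(Box b) {
        b.value = 99;
    }

    // Only rebinds the local copy of the reference; the caller's variable is untouched.
    public static void reassign(Box b) {
        b = new Box(-1);
    }

    public static void main(String[] args) {
        Box box = new Box(1);
        mutate(box);
        System.out.println(box.value); // 99: the object itself was modified
        reassign(box);
        System.out.println(box.value); // still 99: the reference was passed by value
    }
}
```

In C the same two cases have to be written out explicitly (passing the struct, a pointer to it, or a pointer to the pointer), which is part of why the transition feels unfamiliar.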
If the goal is to make you a better coder, aim for languages that actually try to be different. Java, C++ and C are all closely related.
True, one is primarily a procedural language, one tries really hard to pretend to be OOP, and one is a mix of at least 4 different paradigms, but they're all imperative languages, they all share a lot of syntax, and basically, they're all part of the same family of languages.
Learning C isn't going to teach you anything dramatically new. It might teach you a bit about memory layout and such, but you can learn that in many other ways, and it's just not very relevant to a Java programmer.
On the other hand, the language is relatively easy to pick up, and widely used for a lot of Linux software, so if you want to contribute to any of those, learning C is a good idea. It just won't make you a much better Java programmer.
As for C++, calling it a "nightmare behemoth of a language" probably isn't far from the truth. It is big and complex and full of weird pitfalls and traps for the unwary beginner.
But it also has some redeeming qualities. For one thing, it is one of the few languages to support the generic programming paradigm, and that allows you to express many advanced concepts very concisely, and with a high degree of flexibility and code reuse.
It's a language that'll probably both make you hate C++ for being such an overengineered mess, and all other languages for missing C++ features that'd have made your code so much simpler.
Again, learning C++ won't make a huge difference to you as a Java programmer, except that it'll reveal a number of shortcomings in Java that you weren't aware of until now.
Learning either language is good, but what's better is learning something different.
Learn SML or Scheme or Haskell. Or Ruby, Python or Smalltalk. How about Erlang or Occam? Or Prolog.
Learn something that isn't either Java's sibling or ancestor, something that is designed from scratch to have nothing in common with Java. Learn a functional language like SML, or a dynamically typed one like Python, or one that radically changes how you deal with concurrency, like Erlang.
It depends on what you want to learn. I think it's probably best to sit back and consider why you really want to do this at all. If Java does what you want, and you're just doing this out of some misplaced sense of duty, I think there are probably better ways to spend your time. The reputation of C++ as a "nightmare behemoth" is spread mostly by insecure Java programmers trying to justify what, in their hearts, they know to have been a second-rate choice [1].
There are a couple of books specifically written for Java programmers learning C and/or C++. Though it's not specifically for Java programmers, if you do decide on C++ rather than C, I'd consider Accelerated C++.
[1] I'm at least sort of joking, of course, but there are an almost amazing number of Java programmers who seem to have a chip on their shoulder. If you tell somebody who uses Python or Ruby (for just a couple of examples) that it's slow, the typical reaction is them looking a little puzzled and saying something like: "If you say so -- it seems fast enough to me." Suggesting the same about Java is practically guaranteed to produce claims that you're obviously ignorant and expressing nothing but blind hatred.
Edit: As far as choosing between C and C++ goes, for somebody accustomed to Java, C will simply be an exercise in frustration. The difference in language would require considerable adjustment anyway, but moving from a library the size of Java's to one the size of C's will simply result in destroying his productivity for quite a while, and is more likely to just prejudice him against all languages with "C" in the name than help him actually learn anything.
Moving directly from Java to C is like taking somebody whose idea of a sporty car is when he drives the Lincoln Town Car instead of being chauffeured in the limousine, and when he decides racing is cool, you dump him directly in the seat of an Outlaw sprint car. Give him a chance in a (not only much safer, but actually faster) street-legal sports car first...
Regarding (1), I'd probably say C. It's a lot more foreign. Since your goal is to be multilingual for its own sake, moving towards a language that is much different than Java will probably be more useful than learning C++, which will probably make you angry. C++ gets a lot of crap from people, and it's not necessarily awful, but the primary reason is that it is trying to force a new paradigm into the structure of C, which doesn't work as well as a language that starts with that paradigm in the first place.
For (2), I would highly, highly recommend K&R. It assumes some programming familiarity, is brief, to the point, but also is deep enough to explain concepts. It doesn't include exercises, however, which you'll have to find elsewhere. I learned C on the job, I'm afraid (and still paying for it!) so I can't give you educated help there.
Since you're doing this for self-fulfillment and learning, I say go for broke and give C++ a try.
A small preface before I elaborate: I used to work primarily with C++ and have never worked with C code of any significant size. Now I work with C# for the most part, only using C++ on rare occasions.
I think C++ is a better option because:
It's a partial super-set of C: C programs will generally not compile as C++ programs, but the overlap between the two languages is substantial enough that it shouldn't be difficult for you to re-target your skills to work with C code if you need to.
C++ will introduce you to more concepts: You'll get all the fun of memory management and bit twiddling that you can do in C. But you'll also get to see generics like you've never seen them before, how algorithms can be written independently from containers, how to do compile-time polymorphism, how multiple inheritance can be actually useful sometimes, etc.
You'll learn to appreciate the design of the Java language a lot more: C++ is a complicated language with many gotchas and edge cases (see the FAQ and the FQA for some examples). By experiencing them for yourself, you'll be able to better understand many of the design decisions that went into making both Java and C#.
It boils down to this: The more you learn the more that you'll be able to learn. And C++ forces a lot of learning on you, definitely more than C. And that's a good thing.
C++ will probably feel more familiar to you than C, and will probably be easier to get productive with off the bat. However, C is a much smaller language and should be reasonably straightforward to learn (although beware; by learning C you risk permanent brain damage). My personal reference is "C: A Reference Manual" by Harbison & Steele (currently 5th edition). For C++, I just use the O'Reilly nutshell book.
As a C programmer with some C++ experience and currently making the transition to Java, I can tell you the things about C that are probably going to trip you up almost immediately:
C has very little in the way of abstractions; pointers and byte streams are pretty much it. There are no standard container types (lists, maps, etc.). You want anything more sophisticated than a fixed-length array, you will have to roll your own (or use a library developed by someone else).
There is no such thing as garbage collection in C. Every byte you allocate dynamically (via malloc() or calloc()) you are on the hook for deallocating (via free()).
Arrays in C do not behave like arrays in Java; there are some funky rules regarding array types that at first blush do not make sense (and won't until sufficient brain damage has set in). There is no bounds checking on arrays, and some standard library functions (notably gets() and scanf()) make buffer overrun exploits a real risk.
C declaration syntax can get pretty twisted. While you probably won't see anything quite so ugly, declarations like int *(*(*(*f)())[10])(); are possible (f is a pointer to a function returning a pointer to a 10-element array of pointers to functions returning pointer to int).
C implementation limits can vary from platform to platform; for example, the language standard only mandates minimum ranges for types like short, int, and long, but they may be wider than the minimum requirements. If you're expecting an int to always be the same size regardless of platform, you're in for some surprises.
Text processing in C is a pain in the ass. Seriously. C does not have a string type as such.

Syntax analysis question

In school we were assigned to design a language and then to implement it (I'm having so much fun implementing it =)). My teacher told us to use yacc/lex, but I decided to go with Java + the regex API. Here is how the language I designed looks:
Program "my program"
var yourName = read()
if { equals("guy1" to yourName) }
print("hello my friend")
else
print("hello extranger")
end
Program End
Well, as you can see, it's a pretty basic language =).
I thought I could implement it in a very OOP fashion: make an abstract class Sentence, have subclasses like VariableAssignment, IfSentence etc., and have a class Program which is only a bunch of sentences, right? And then call an abstract method eval on all Sentences. So my initial approach to compile the language consisted of only two phases:
Identify the syntax of each line
Create the corresponding class for each line
Of course, if something goes wrong in either phase I could raise an error.
My question is, am I doing it wrong? Should I go over all phases like the theory says (lexical, syntactical, semantical)? Should I continue with my naive two-phase compiler?
I won't ask the obvious question of why you're not following the advice of your instructor and using yacc/lex because I know the answer. You wanted to go off and do something that you thought was cool and would help you learn. Unfortunately, that approach was recommended by your professor because, as another poster stated, a lot of very smart people before you have explored multiple approaches and spent vast quantities of time trying to find a good solution.
You can make a two-phase compiler work, but you will need to accept that it will never be as good as going through the full process because it's harder to detect errors. A lot harder, in fact. In some cases, you won't even be able to tell that there's an error until it's too late, i.e. when the program is already compiled and you are attempting to run it.
If you want to learn a lot more about it, go with the two-phase approach and you will run into the same problems that the people before you ran into. Just be sure to understand that it will take you a lot longer to get to a final solution, you might be docked points on your project, and it might not work right.
That said, you're going to learn more about it than anyone else in the class. If you have the time to spare, I'd do it the way you are now. The knowledge might come in handy down the road. I would also talk to your professor and tell him that you're going to do it another way against his recommendations because you want to have a more thorough understanding. Perhaps he won't knock points off from your project for being ambitious, even if it turns out wrong.
After all, the point of doing projects in college is to learn.
A lot of smart people thought about this, and from your post I take it that they came to the conclusion that all the phases are needed.
So if you want your compiler to work, go the way the theory dictates.
If you want to understand why it dictates the phases, try the shortcut. It will probably take a lot longer.
Disclaimer: I have no idea about compiler theory
Another note: You have a problem; You decide to solve it using regexps; Now you have two problems
If you use regexes to parse each line, your language will have a very limited syntax.
You would not be able to parse each line using just a regular expression API if your syntax becomes more complex. Even the if { equals("guy1" to yourName) } would become impossible to parse with regexes if you start adding AND and OR operators, and what would happen if you start supporting escape characters like \n in your string literals?
The Java Regex API would be able to help you with the lexical analyzer, but you would have to write the parser from there. You could take one of several approaches:
If you're using Java, you could look at Antlr (which negates the need for writing a lexical analyzer with Java's regex library), or
You could write a recursive descent parser by hand
among others
(also, "Statement" is a synonym for "Sentence" that is more common in compiler texts)
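To illustrate the lexical-analyzer half of that suggestion, here is a minimal sketch of a regex-based tokenizer in Java. The class name LexerSketch, the token kinds and the set of symbols are my own assumptions for illustration; they are not taken from the language spec in the question. Note that keywords such as if and to come out as plain identifiers here, and that the string pattern already copes with escaped characters such as \" and \n.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: splits one line of source text into "KIND:text" tokens.
class LexerSketch {
    // One alternative per token kind; the named groups tell us which one matched.
    // The kinds (WS, STRING, IDENT, SYMBOL) are assumptions made for this sketch.
    private static final Pattern TOKEN = Pattern.compile(
          "(?<WS>\\s+)"
        + "|(?<STRING>\"(?:[^\"\\\\]|\\\\.)*\")"   // string literal with backslash escapes
        + "|(?<IDENT>[A-Za-z_][A-Za-z0-9_]*)"
        + "|(?<SYMBOL>[(){}=])");

    static List<String> tokenize(String src) {
        List<String> tokens = new ArrayList<>();
        Matcher m = TOKEN.matcher(src);
        int pos = 0;
        while (pos < src.length()) {
            // Only try to match at the current position, never further ahead.
            m.region(pos, src.length());
            if (!m.lookingAt()) {
                throw new IllegalArgumentException("Unexpected character at offset " + pos);
            }
            if (m.group("WS") == null) {           // whitespace is dropped
                String kind = m.group("STRING") != null ? "STRING"
                            : m.group("IDENT") != null ? "IDENT"
                            : "SYMBOL";
                tokens.add(kind + ":" + m.group());
            }
            pos = m.end();
        }
        return tokens;
    }

    public static void main(String[] args) {
        for (String t : tokenize("if { equals(\"guy1\" to yourName) }")) {
            System.out.println(t);
        }
    }
}
```

The output is only a flat list of tokens. Deciding whether those tokens form a valid statement is still the parser's job, which is where Antlr or a hand-written recursive descent parser comes in.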
If you want to use only regular expressions to parse your language, your language can only be regular. This is a big restriction: for example, arbitrarily deep nesting would be impossible, as you would have to teach your parser each nesting combination separately. I am not sure if building a Turing-complete regular language is even possible.
If you really want to get your hands dirty, code a recursive descent parser. If you want to understand compiler theory, use Antlr and concentrate on the principles, leaving the implementation to the parser generator.
BTW, why would you want to complicate your life with regex?!
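For a feel of what a hand-written recursive descent parser looks like, here is a small Java sketch. It is not the language from the question: the grammar (true, false, and, or, parentheses) and the class name ConditionParser are assumptions I made up to keep it short. The point is that each grammar rule becomes one method, and nesting to any depth falls out of the methods calling each other.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical grammar, for illustration only:
//   expr   := term ("or" term)*
//   term   := factor ("and" factor)*
//   factor := "true" | "false" | "(" expr ")"
class ConditionParser {
    private final List<String> tokens = new ArrayList<>();
    private int pos = 0;

    ConditionParser(String src) {
        // Crude tokenizer: keeps parentheses and words, ignores everything else.
        Matcher m = Pattern.compile("[()]|[A-Za-z]+").matcher(src);
        while (m.find()) tokens.add(m.group());
    }

    static boolean evaluate(String src) {
        ConditionParser p = new ConditionParser(src);
        boolean result = p.expr();
        if (p.pos != p.tokens.size()) {
            throw new IllegalArgumentException("Unexpected token: " + p.peek());
        }
        return result;
    }

    private String peek() {
        return pos < tokens.size() ? tokens.get(pos) : "<end>";
    }

    // Consume the next token only if it is the expected one.
    private boolean accept(String expected) {
        if (peek().equals(expected)) { pos++; return true; }
        return false;
    }

    private boolean expr() {
        boolean value = term();
        while (accept("or")) {
            boolean right = term();   // always parse the right side, then combine
            value = value || right;
        }
        return value;
    }

    private boolean term() {
        boolean value = factor();
        while (accept("and")) {
            boolean right = factor();
            value = value && right;
        }
        return value;
    }

    private boolean factor() {
        if (accept("true")) return true;
        if (accept("false")) return false;
        if (accept("(")) {
            boolean value = expr();   // recursion handles arbitrary nesting
            if (!accept(")")) {
                throw new IllegalArgumentException("Expected ')' but found " + peek());
            }
            return value;
        }
        throw new IllegalArgumentException("Unexpected token: " + peek());
    }

    public static void main(String[] args) {
        System.out.println(evaluate("true and (false or (true and true))")); // prints true
    }
}
```

A real compiler would build Statement objects instead of evaluating on the spot, but the shape of the code stays the same.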
