I've started with JOGL lately, I know how to create and draw objects on the canvas, but I couldn't find tutorial or explanations on how to set and rotate the camera.
I only found source code, but since I'm quite new with this, it doesn't help too much.
Does anyone know of a good tutorial or place to start? I googled but couldn't find anything (only for JOGL 1.5, and I'm using 2.0).
As datenwolf points out my explanation is tied to the OpenGL 2 pipeline, which has been superseded. This means you have to do your own manipulation from world space into screen space if you want to eschew the deprecated methods. Sadly, this little footnote hasn't gotten around to being attached to every last bit of OpenGL sample code or commentary in the universe yet.
Of course I don't know why it's necessarily a bad thing to use the existing GL2 pipeline before picking a library to do the same or building one yourself.
I'm playing around with JOGL myself, though I have some limited prior experience with OpenGL. OpenGL uses two matrices to transform all the 3D points you pass through it from 3D model space into 2D screen space, the Projection matrix and the ModelView matrix.
The projection matrix is designed to compensate for the translation between the 3D world and the 2D screen, projecting a higher dimensional space onto a lower dimensional one. You can get lots more details by Googling gluPerspective, which is a function in the GLU utility library (not GLUT; JOGL exposes it through its GLU class) for setting that matrix.
The ModelView matrix [1], on the other hand, is responsible for translating 3D coordinates from scene space into view (or camera) space. How exactly this is done depends on how you're representing the camera. Three common ways of representing the camera are
A vector for the position, a vector for the target of the camera, and a vector for the 'up' direction
A vector for the position plus a quaternion for the orientation (plus perhaps a single floating point value for scale, or leave scale set to 1)
A single 4x4 matrix containing position, orientation and scale
Whichever one you use will require you to write code to translate the representation into something you can give to the OpenGL methods to set up the ModelView matrix, as well as code that translates user actions into modifications to the camera data.
There are a number of demos in JOGL-Demos and JOCL-Demos that involve this kind of manipulation. For instance, this class is designed to act as a kind of primitive camera which can zoom in and out and rotate around the origin of the scene, but cannot turn otherwise. It's therefore represented as only 3 floats: an X and Y rotation and a Z distance. It applies its transform to the ModelView something like this [2]:
gl.glTranslatef(0, 0, z);
gl.glRotatef(rotx, 1f, 0f, 0f);
gl.glRotatef(roty, 0f, 1.0f, 0f);
I'm currently experimenting with a Quaternion+Vector+Float based camera using the Java Vecmath library, and I apply my camera transform like this:
Quat4d orientation;
Vector3d position;
double scale;
public void applyMatrix(GL2 gl) {
    Matrix4d matrix = new Matrix4d(orientation, position, scale);
    // OpenGL expects column-major order, so the matrix is written out one column at a time.
    double[] glmatrix = new double[] {
        matrix.m00, matrix.m10, matrix.m20, matrix.m30,
        matrix.m01, matrix.m11, matrix.m21, matrix.m31,
        matrix.m02, matrix.m12, matrix.m22, matrix.m32,
        matrix.m03, matrix.m13, matrix.m23, matrix.m33,
    };
    gl.glLoadMatrixd(glmatrix, 0);
}
1: The reason it's called the ModelView and not just the View matrix is because you can actually push and pop matrices on the ModelView stack (this is true of all OpenGL transformation matrices I believe). Typically you either have a full stack of matrices representing various transformations of items relative to one another in the scene graph, with the bottom one representing the camera transform, or you have a single camera transform and keep everything in the scene graph in world space coordinates (which kind of defeats the point of having a scene graph, but whatever).
2: In practice you wouldn't see the calls to gl.glMatrixMode(GL2.GL_MODELVIEW); in the code because the GL state machine is simply left in MODELVIEW mode all the time unless you're actively setting the projection matrix.
but I couldn't find tutorial or explanations on how to set and rotate the camera
Because there is none. OpenGL is not a scene graph. It's mostly a sophisticated canvas with simple point, line and triangle drawing tools. Placing "objects" actually means applying linear transformations that map a 3-dimensional vector onto a 2D framebuffer.
So instead of placing the "camera" you just move around the whole world (transformation) in the opposite way you'd move the camera, yielding the very same outcome.
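As a rough illustration of that idea (plain Java, not JOGL API; the class and method names are made up for this sketch), the function below turns a world-space point into view space by applying the opposite of the camera's own movement: first subtract the camera position, then rotate by the negated camera yaw.

```java
import java.util.Arrays;

// Illustrative sketch: ViewTransformDemo is a made-up name, not part of JOGL.
final class ViewTransformDemo {
    /**
     * Converts a world-space point to view space for a camera located at cameraPos
     * that has been turned by yaw radians about the +Y axis.
     */
    static double[] worldToView(double[] p, double[] cameraPos, double yaw) {
        // Step 1: move the world by the negated camera position.
        double x = p[0] - cameraPos[0];
        double y = p[1] - cameraPos[1];
        double z = p[2] - cameraPos[2];
        // Step 2: rotate the world by -yaw about Y, undoing the camera's turn.
        double c = Math.cos(yaw), s = Math.sin(yaw);
        return new double[] { x * c - z * s, y, x * s + z * c };
    }

    public static void main(String[] args) {
        // Camera at the origin, turned 90 degrees so that it looks down the -X axis.
        double[] seen = worldToView(new double[] {-5, 0, 0}, new double[] {0, 0, 0}, Math.PI / 2);
        // The point ends up (up to rounding) at (0, 0, -5): straight ahead, since view space looks down -Z.
        System.out.println(Arrays.toString(seen));
    }
}
```

With the fixed-function pipeline the same two steps would correspond to calling glRotatef with the negated yaw (in degrees) about the Y axis, followed by glTranslatef with the negated camera position.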
Over the past few weeks I've been attempting to learn the libGDX library. I'm finding it hard, especially for my first endeavor toward game development, to comprehend the system of Camera/viewport relationships. One line of code that I've been told to use, and the API mentions, is:
batch.setProjectionMatrix(camera.combined);
Despite a good 4 hours of research, I'm still lacking a complete understanding of the functionality of this code. It is to my basic understanding that it "tells" the batch where the camera is looking. My lack of comprehension is depressing and angering, and I'd appreciate if anyone could assist me. Another issue with the code snippet is that I'm unsure of when it's necessary to implement (in the render method, create method, etc).
Consider taking a picture with a camera. E.g. using your smartphone camera taking a picture of a bench in the park. When you do that, then you'll see the bench in the park on the screen of your smartphone. This might seem very obvious, but let's look at what this involves.
The location of the bench on the picture is relative to where you were standing when taking the photo. In other words, it is relative to the camera. In a typical game, you don't place objects relative to the camera. Instead you place them in your game world. Translating between your game world and your camera is done using a matrix (which is simply a mathematical way to transform coordinates). E.g. when you move the camera to the right, the bench moves to the left on the photo. This is called the view matrix.
The exact location of the bench on the picture also depends on the distance between bench and the camera. At least, it does in 3D (2D is very similar, so keep reading). When it is further away it is smaller, when it is close to the camera it is bigger. This is called a perspective projection. You could also have an orthographic projection, in which case the size of the object does not change according to the distance to the camera. Either way, the location and size of the bench in the park is translated to the location and size in pixels on the screen. E.g. the bench is two meters wide in the park, while it is 380 pixels on the photo. This is called the projection matrix.
camera.combined represents the combined view and projection matrix. In other words: it describes where things in your game world should be rendered onto the screen.
Calling batch.setProjectionMatrix(cam.combined); instructs the batch to use that combined matrix. You should call that whenever the value changes. This is typically when resize is called and also whenever you move or otherwise alter the camera.
If you are uncertain then you can call that in the start of your render method.
The other answer is excellent, but I figure a different way of describing it might help it to click.
You generally deal with your game in "world space", a coordinate system that is analogous to the real world. In linear algebra, you can convert points in space from one coordinate system to another by multiplying the point's coordinates by a matrix that represents the relation between two coordinate systems.
The view matrix is multiplied by a point to convert it from world space to camera space (the camera's point of view). The projection matrix is used to convert a point from camera space to screen space (the flat 2D rectangle of your device's screen). When you call update() on a camera in Libgdx, it applies your latest changes to position, orientation, viewport size, field of view, etc. to its view and projection matrices so they can be used in shaders.
You rarely need to deal with stuff in camera space in 2D, so SpriteBatch doesn't need separate view and projection matrices. They can be combined into a single matrix that converts straight from world space to screen space, which is already done automatically in the Camera, hence the camera.combined matrix.
SpriteBatch has a default built-in shader that multiplies this projection matrix by all the vertices of your sprites so they will be properly mapped to the flat screen.
You should call setProjectionMatrix whenever you have moved the camera or resized the screen.
There is a third type of matrix called a model matrix that is used for 3D stuff. A model matrix describes the model's orientation, scale, and position in world space. So it is multiplied by coordinates in the model to move them from local space to world space.
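To make the chain of matrices concrete, here is a small plain-Java sketch. It does not use libGDX; plain arrays stand in for Matrix4, and every name and number in it is hypothetical. It pushes a point through model, view and projection one step at a time, then checks that a single pre-multiplied matrix gives the same answer, which is the idea behind camera.combined.

```java
import java.util.Arrays;

// Illustrative sketch: plain arrays stand in for libGDX's Matrix4; every name here is hypothetical.
final class MvpChainDemo {
    /** Returns a * b for 4x4 row-major matrices. */
    static double[][] mul(double[][] a, double[][] b) {
        double[][] r = new double[4][4];
        for (int i = 0; i < 4; i++)
            for (int j = 0; j < 4; j++)
                for (int k = 0; k < 4; k++)
                    r[i][j] += a[i][k] * b[k][j];
        return r;
    }

    /** Applies m to the point (p[0], p[1], p[2], 1) and returns the transformed x, y, z. */
    static double[] apply(double[][] m, double[] p) {
        double[] in = {p[0], p[1], p[2], 1};
        double[] out = new double[3];
        for (int i = 0; i < 3; i++)
            for (int k = 0; k < 4; k++)
                out[i] += m[i][k] * in[k];
        return out;
    }

    /** Translation matrix; used for the model matrix and, with negated values, for the view matrix. */
    static double[][] translate(double tx, double ty, double tz) {
        return new double[][] {{1, 0, 0, tx}, {0, 1, 0, ty}, {0, 0, 1, tz}, {0, 0, 0, 1}};
    }

    /** Orthographic projection mapping a width x height window centered on the origin to [-1, 1]; z is left unchanged. */
    static double[][] ortho(double width, double height) {
        return new double[][] {{2 / width, 0, 0, 0}, {0, 2 / height, 0, 0}, {0, 0, 1, 0}, {0, 0, 0, 1}};
    }

    public static void main(String[] args) {
        double[][] model = translate(20, 10, 0);      // object placed at (20, 10) in the world
        double[][] view = translate(-30, -10, 0);     // camera centered on (30, 10)
        double[][] projection = ortho(200, 100);
        double[][] combined = mul(projection, view);  // plays the role of camera.combined

        double[] local = {5, 0, 0};                   // a vertex in the object's local space
        double[] stepwise = apply(projection, apply(view, apply(model, local)));
        double[] oneShot = apply(mul(combined, model), local);
        System.out.println(Arrays.toString(stepwise));
        System.out.println(Arrays.toString(oneShot));
    }
}
```

Both printed vectors should agree up to floating-point rounding. In 2D the model step is usually skipped because sprites are already given in world coordinates, which is why a SpriteBatch only needs the combined matrix.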
Take for example a basic sidescrolling game. As the player moves to the side, the camera pans to follow them. This means that where objects are in the world doesn't necessarily correspond to where they are on the screen, since the screen and the world move relative to each other.
Here's an example: say your screen is 100px*100px square (for some reason). You place an object at position (50, 0), so it's now in the middle and at the bottom of the screen. Now say you move your player over to the right, and the whole screen pans to follow the player. This means that the object you placed earlier should have moved left on the screen. So it's still at (50, 0) in the world, since it didn't actually move relative to the rest of the scenery, but it should now be drawn at, say, (10, 0) on the screen, since which part of the world the screen is looking at has changed. This is the difference between "worldspace" (where an object is in the world) and "screenspace" (where the object is drawn on the actual display).
When you try to draw with a SpriteBatch, it is by default going to assume worldspace coordinates are the same as screenspace coordinates: when you say "draw at (50, 0)", it's going to draw the object at (50, 0) on the screen. Even if the camera moves, it's always going to draw at (50, 0) on the screen, so as the camera pans, the object will follow and stay stuck to the same place on the screen.
Since you usually don't want that, you give the SpriteBatch a projection matrix, which is a transformation matrix that tells how to convert screenspace coordinates to worldspace coordinates, and vice versa. This way, when you tell the batch "draw at (50, 0)", it can look at the matrix it got from the camera and see that, since the camera has moved, (50, 0) in the world actually means (10, 0) on the screen, and it will draw your sprite in the right place.
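The numbers from this example can be checked with a few lines of plain Java. This is only a sketch of the arithmetic, assuming one world unit per pixel and a camera that pans horizontally; the names are invented and it is not how SpriteBatch is implemented internally.

```java
// Illustrative sketch of the sidescroller example; PanDemo and its methods are hypothetical.
final class PanDemo {
    /** World x to screen x, where cameraLeft is the world x shown at the screen's left edge. */
    static float worldToScreenX(float worldX, float cameraLeft) {
        return worldX - cameraLeft;
    }

    /** The inverse mapping, e.g. for turning a touch position back into a world position. */
    static float screenToWorldX(float screenX, float cameraLeft) {
        return screenX + cameraLeft;
    }

    public static void main(String[] args) {
        float cameraLeft = 40f; // the camera has panned 40 units to the right
        System.out.println(worldToScreenX(50f, cameraLeft)); // prints 10.0
        System.out.println(screenToWorldX(10f, cameraLeft)); // prints 50.0
    }
}
```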
I am currently drawing a single transparent 3D mesh, generated via a marching cubes algorithm, with the intention of having more objects once the problem is fixed.
As it stands, I can draw 3D shapes perfectly well, but when I implement transparency (in my case by changing the opacity of the mesh's PhongMaterial) I get a weird effect where only a few triangles are rendered when they are behind another triangle.
see example.
(sorry, I was unable to post the image directly, due to rep)
When the "stick" is behind the larger shape there seems to be a loss in triangles and I currently have no idea why.
The red is all the same mesh rendered in the same way.
I am currently using an ambient light if that makes a difference.
Some example code:
MeshView mesh = generateMesh(); // placeholder: mesh data generated via marching cubes
PhongMaterial mat = new PhongMaterial(new Color(1, 0, 0, 0.5)); // PhongMaterial takes a Color, not four numbers
mesh.setMaterial(mat);
AmbientLight light = new AmbientLight();
light.setColor(new Color(1, 0, 0, 0.5)); // I don't believe the alpha makes a difference
group.getChildren().addAll(light, mesh);
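Transparency only works correctly when the triangle faces are sorted by distance to the camera and drawn from farthest to nearest. Otherwise a nearer, semi-transparent triangle that happens to be drawn first writes its depth into the depth buffer, and triangles behind it that are drawn later fail the depth test and never show through. This is a consequence of the fact that consumer 3D cards break any scene down into triangles so that they can render each one individually, which allows them to render hundreds of triangles at the same time when they have hundreds of cores. Older cards were rated by the number of triangles per second they could render.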
On more modern cards, part of the triangle rendering has been moved to the driver which uses the vector engines on the card to calculate the color of each point in software. This is still fast since you can have 1000+ vector CPUs plus it allows you to create complex programs that modify each vertex/pixel before it's stored in memory which allows you to create shiny surfaces, etc.
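A minimal sketch of the sorting step, in plain Java rather than JavaFX: each triangle is reduced to its centroid and the list is ordered from farthest to nearest relative to the eye position. The Triangle class and method names are invented for illustration, and how the sorted order is fed back into a mesh depends on your setup. Sorting by centroid is also only an approximation and can still go wrong for triangles that intersect or overlap cyclically.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Illustrative sketch: BackToFrontSort and Triangle are hypothetical helper types.
final class BackToFrontSort {
    /** A triangle reduced to a label and its centroid, which is all the sort needs. */
    static final class Triangle {
        final String name;
        final double cx, cy, cz;

        Triangle(String name, double cx, double cy, double cz) {
            this.name = name;
            this.cx = cx;
            this.cy = cy;
            this.cz = cz;
        }

        /** Squared distance from the centroid to the eye; enough for ordering, no sqrt needed. */
        double distanceSquaredTo(double[] eye) {
            double dx = cx - eye[0], dy = cy - eye[1], dz = cz - eye[2];
            return dx * dx + dy * dy + dz * dz;
        }
    }

    /** Returns a copy of the list ordered farthest first, so nearer faces are drawn over farther ones. */
    static List<Triangle> sortBackToFront(List<Triangle> triangles, double[] eye) {
        List<Triangle> sorted = new ArrayList<>(triangles);
        sorted.sort(Comparator.comparingDouble((Triangle t) -> t.distanceSquaredTo(eye)).reversed());
        return sorted;
    }

    public static void main(String[] args) {
        double[] eye = {0, 0, 10};
        List<Triangle> triangles = Arrays.asList(
                new Triangle("near", 0, 0, 5),
                new Triangle("far", 0, 0, -5),
                new Triangle("middle", 0, 0, 0));
        for (Triangle t : sortBackToFront(triangles, eye)) {
            System.out.println(t.name); // far, middle, near
        }
    }
}
```

The sort has to be redone whenever the camera or the mesh moves, since the order depends on the viewpoint.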
I am working on an OpenGL game in Java with LWJGL (ThinMatrix's tutorials at the moment) and I just added my skybox. As you can see from the picture, however, it is clipping through the trees and covering everything behind a certain point.
Here is my rendering code for the skybox:
public void render(Camera camera, float r, float g, float b) {
    shader.loadFogColor(r, g, b);
    GL11.glDrawArrays(GL11.GL_TRIANGLES, 0, cube.getVertexCount());
}

private void bindTextures() {
    GL11.glBindTexture(GL13.GL_TEXTURE_CUBE_MAP, texture);
    GL11.glBindTexture(GL13.GL_TEXTURE_CUBE_MAP, nightTexture);
}
Also, if it is needed, here is my code for my master renderer:
public void render(List<Light> lights, Camera camera) {
    shader.loadSkyColor(RED, GREEN, BLUE);
    terrainShader.loadSkyColor(RED, GREEN, BLUE);
    skyboxRenderer.render(camera, RED, GREEN, BLUE);
}
There are two things you can do:
If you draw your skybox first, you can disable the depth test with glDisable(GL_DEPTH_TEST) or depth writes with glDepthMask(false). This prevents the skybox from writing depth values, so the skybox will never be in front of anything that is drawn later.
If you draw your skybox last, you can make it literally infinitely big by using vertex coordinates with a w-coordinate of 0. A vertex (x y z 0) is a point infinitely far away in the direction of the vector (x y z). To prevent clipping, you have to enable depth clamping with glEnable(GL_DEPTH_CLAMP). This stops OpenGL from clipping away your skybox faces, and you can be sure that the skybox is always at the maximum distance and will never hide anything you have drawn earlier.
The advantage of the second method lies in the depth test. Because depth values have already been written for your scene, the OpenGL pipeline can skip the calculation of the skybox pixels that are already covered by your scene. But the fragment shader for skyboxes is usually very trivial, so it shouldn't make that much of a difference.
I am not familiar with LWJGL; are you allowed to write shaders? In plain OpenGL, you don't have to worry about the size of the skybox cube; it can be {1.0, 1.0, 1.0} if you like. What you need is to first place your camera at {0.0, 0.0, 0.0} and then make the skybox fail the depth test against everything in your scene. You can achieve that by making the skybox's z value in normalized device coordinates 1.0.
Do this in your vertex shader
gl_Position = (mvp_mat * vec4(xyz, 1.0)).xyww;
After the perspective divide by w, z will be w / w, or 1.0.
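The arithmetic behind the swizzle can be reproduced in plain Java. The sketch below is not shader code and the names are made up: it takes an arbitrary clip-space position, applies the same component reshuffle as .xyww, and performs the perspective divide to show that the resulting depth is always 1.0.

```java
// Illustrative sketch: mimics the vertex shader arithmetic on the CPU; all names are hypothetical.
final class SkyboxDepthDemo {
    /** Mimics the GLSL swizzle .xyww on a clip-space position (x, y, z, w). */
    static double[] xyww(double[] clip) {
        return new double[] { clip[0], clip[1], clip[3], clip[3] };
    }

    /** Perspective divide: clip space to normalized device coordinates. */
    static double[] toNdc(double[] clip) {
        return new double[] { clip[0] / clip[3], clip[1] / clip[3], clip[2] / clip[3] };
    }

    public static void main(String[] args) {
        double[] clip = { 0.3, -0.2, 0.95, 4.0 }; // any position with w != 0
        double[] ndc = toNdc(xyww(clip));
        System.out.println(ndc[2]); // prints 1.0, the far end of the depth range
    }
}
```

One caveat worth checking in your own setup: if the depth buffer is cleared to 1.0, the default GL_LESS comparison rejects fragments whose depth is exactly 1.0, so this trick is normally paired with glDepthFunc(GL_LEQUAL) while the skybox is drawn.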
You might want to check out How can I increase distance (zfar/gluPerspective) where openGL stops drawing objects?
The problem in that instance is that the skybox itself was too small and intersecting with the geometry.
I also see that you're rendering your terrain first, and then your skybox. I would try flipping the order there; draw the skybox first then the terrain.
First, you should remove the skybox and render the scene again to check whether it is the skybox that clips the trees.
If it is the skybox, simply scale the skybox so that it contains all the objects in the terrain.
If not, it is likely to be a problem with the camera (like Hanston said). You need to set the far clipping plane at least behind the skybox. That is, it should be larger than the diameter of your skybox.
If you want to scale the skybox or any other object, use the transformation matrix. The game engine uses a 4x4 matrix to control the size, location and rotation of the model. You can see an example in TerrainRenderer.java, in the function loadModelMatrix. It creates a transformation matrix and uploads it to the shader. You should do the same thing, but change the scale parameter to what you want.
I'm experimenting with LibGDX and 3D in a projection view. Right now I'm looking at how to determine the outermost bounds of my viewport in world space at z=0.0, in order to draw coordinate grid no larger than necessary. However, I seem to have outpaced my education in that I haven't taken a formal linear algebra class and am still a little fuzzy on matrix math.
Is there a way to determine where I should start and stop drawing lines without resorting to using picking and drawing a transparent plane to intersect with?
LibGDX's unproject function takes screen coordinates in a Vector3 and returns a Vector3 in world space somewhere between the near clipping plane and the far one, depending on the provided z. However, given that I have a translated and rotated Camera (an encapsulation of the view-projection matrix and a slew of convenience methods), it occurs to me that I can't pick an arbitrary z to put in the window coordinate vector and just set it to 0.0 after unprojection, as that point probably won't be the furthest viewable point in the viewport. So how do I know what z value to use in the window coordinate that will give me the x and y I need in world space at z=0.0?
So apparently it looks like the problem I'm looking at is plane intersection, which would require ray tracing. So now I suppose my question is this: is ray tracing 4 times per render loop (or, I suppose whenever the camera has moved) worth the payoff of being able to dynamically draw a worldspace coordinate grid no larger than the viewport? If not, is there a cheaper algorithm I can use to estimate where I should start and stop drawing lines?
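For a sense of the cost, here is a sketch of a single ray/plane intersection against z = 0 in plain Java. It assumes you already have two world-space points per screen corner, for example from unprojecting the corner with window z = 0 (near plane) and window z = 1 (far plane); the class and method names are hypothetical. libGDX appears to offer Camera.getPickRay and Intersector.intersectRayPlane for the same job, but check the documentation before relying on that.

```java
import java.util.Arrays;

// Illustrative sketch: GridPlaneIntersection is a hypothetical helper, not libGDX API.
final class GridPlaneIntersection {
    /**
     * Intersects the ray that starts at near and passes through far with the plane z = 0.
     * Returns {x, y} of the hit, or null if the ray is parallel to the plane
     * or the plane lies behind the near point.
     */
    static double[] intersectZ0(double[] near, double[] far) {
        double dz = far[2] - near[2];
        if (Math.abs(dz) < 1e-12) {
            return null; // ray runs parallel to the plane
        }
        double t = -near[2] / dz; // solve near.z + t * dz = 0
        if (t < 0) {
            return null; // the plane is behind the camera along this ray
        }
        return new double[] {
            near[0] + t * (far[0] - near[0]),
            near[1] + t * (far[1] - near[1])
        };
    }

    public static void main(String[] args) {
        double[] hit = intersectZ0(new double[] {0, 0, 10}, new double[] {10, 0, -10});
        System.out.println(Arrays.toString(hit)); // prints [5.0, 0.0]
    }
}
```

Each intersection is a handful of multiplications and one division, so doing it for the four corners every frame is negligible next to drawing the grid itself. Taking the minimum and maximum x and y of the four hits gives the grid extent; a corner whose ray misses the plane (for example when the horizon is in view) returns null and needs a fallback such as clamping to a maximum grid size.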
I have been searching Stack Overflow for an introduction to 2D selection in OpenGL ES. I mostly see questions about 3D.
I'm designing a 2D tile-based level editor on Android 4.0.3, using OpenGL ES. In the level editor, there is a 2D, yellow, square object placed in the center of the screen. All I want is to detect whether the object has been touched by the user.
In the level editor, there aren't any tiles overlapping. Instead, they are placed side-by-side, just like two nearby pixels in a bitmap image in MS Paint. My purpose is to individually detect a touch event for each square object in the level editor.
The object is created with a simple vertex array, and using GL_TRIANGLES to draw 2 flat right triangles. There are no manipulations and no loading from a file or anything. The only thing I know is that if a user touches any one of the yellow triangles, then both yellow triangles are to be selected.
Could anyone provide a hint as to how I need to do this? Thanks in advance.
This is the draw() function:
public void draw(GL10 gl) {
    gl.glTranslatef(-(deltaX - translateX), (deltaY - translateY), 1f);
    gl.glColor4f(1f, 1f, 0f, 1f);
    //TODO: Move ClientState and MatrixStack outside of draw().
    gl.glVertexPointer(2, GL10.GL_FLOAT, 0, vertices);
    gl.glDrawArrays(GL10.GL_TRIANGLES, 0, 6);
}
I'm still missing some info. Are you using a camera, or pushing other matrices before the model rendering? For example, if you are using an orthographic camera, you can easily unproject your screen coordinates [x_screen, y_screen] like this (y is analogous):
I'm not using a camera, but I'm probably using an orthographic projection. Again, I do not know, as I'm just using a common OpenGL function. I do push and pop matrices, because I plan on integrating many tiles (square 2D objects) with different translation matrices. No two tiles will have the same translation matrix M.
Is a perspective projection the same as orthographic projection when it comes to 2D? I do not see any differences between the two.
Here's the initial setup when the surface is created (a class extending GLSurfaceView, and implementing GLSurfaceView.Renderer):
public void onSurfaceChanged(GL10 gl, int width, int height) {
    gl.glViewport(0, 0, width, height);
}

public void onSurfaceCreated(GL10 gl, EGLConfig arg1) {
}

public void onDrawFrame(GL10 gl) {
    gl.glOrthof(0f, super.getWidth(), 0f, super.getHeight(), 1, -1);
}

private void clearScreen(GL10 gl) {
    gl.glClearColor(0.5f, 1f, 1f, 1f);
}
A basic approach would be the following:
1 - Define a bounding box for each "touchable" object. This could be just a rectangle (x, y, width, height).
2 - When you update a tile in the world, update its bounding box as well (completely in world coordinates).
3 - When the user touches the screen, unproject the screen coordinates to world coordinates.
4 - Check whether the unprojected point overlaps with any bounding box.
Some hints on the previous items. [Edited]
1 and 2 - You have to keep track of where you are rendering your tiles. Store their position and size; a rectangle is a convenient structure. In your example it could be computed like this, and you have to recompute it whenever the model changes. Let's call it Rectangle r:
r.x = yourTile.position.x -(deltaX - translateX)
r.y = yourTile.position.y -(deltaY - translateY)
r.width= yourTile.width //as there is no model scaling
r.height = yourTile.height//
3 - If you are using an orthographic camera, you can easily unproject your screen coordinates [x_screen, y_screen] like this (y is analogous):
x_model = ((x_screen / GL_viewport_width) - 0.5) * camera.WIDTH + camera.position.x
4 - For each of your Rectangles check if [x_model; y_model] is inside it.
[2nd Edit] From the way you are updating your matrices, you can consider yourself to be using a camera with position (surfaceView.width()/2, surfaceView.height()/2). You are matching 1 pixel on screen to 1 unit in the world, so you don't need to unproject anything. You can substitute those values into my formula and get x_model = x_screen. (You'll need to flip the Y component of the touch event, because Y grows downwards in Android touch coordinates and upwards in GL.)
Final words. If user touches point [x,y] check if [x, screenHeight-y]* hits some of your rectangles and you are done.
Do some debugging, log the touching points and see if they are as expected. Generate your rectangles and see if they match what you see on screen, then is a matter of checking if a point is inside a rectangle.
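The whole check can be sketched in plain Java as below. It assumes the setup from the question, an orthographic projection where one pixel equals one world unit with the origin at the bottom-left, so the only conversion needed is the Y flip. Rect, findTouchedTile and the sample numbers are all made up for illustration.

```java
import java.util.Arrays;
import java.util.List;

// Illustrative sketch: TileHitTest and Rect are hypothetical helper types, not Android or OpenGL ES API.
final class TileHitTest {
    /** Axis-aligned bounding box of one tile, in world coordinates (origin at bottom-left). */
    static final class Rect {
        final float x, y, width, height;

        Rect(float x, float y, float width, float height) {
            this.x = x;
            this.y = y;
            this.width = width;
            this.height = height;
        }

        boolean contains(float px, float py) {
            return px >= x && px < x + width && py >= y && py < y + height;
        }
    }

    /** Returns the index of the first tile under the touch point, or -1 if none was hit. */
    static int findTouchedTile(List<Rect> tiles, float touchX, float touchY, float screenHeight) {
        float worldX = touchX;                // 1 pixel == 1 world unit, no scaling needed
        float worldY = screenHeight - touchY; // touch Y grows downwards, GL Y grows upwards
        for (int i = 0; i < tiles.size(); i++) {
            if (tiles.get(i).contains(worldX, worldY)) {
                return i;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        List<Rect> tiles = Arrays.asList(new Rect(100, 100, 50, 50));
        System.out.println(findTouchedTile(tiles, 120, 360, 480)); // prints 0: world y is 480 - 360 = 120
        System.out.println(findTouchedTile(tiles, 120, 120, 480)); // prints -1: world y is 360
    }
}
```

Since both triangles of a tile share one rectangle, a hit on either triangle selects the whole tile.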
I must tell you that you should not set the camera to screen dimensions, because your app will look dramatically different on different devices. This is a topic of its own, so I won't go any further, but consider defining your model in terms of world units, independent of screen size. This is getting off-topic, but I hope you have gotten a good glimpse of what you need to know!
*This is the Y flip I mentioned above.
PS: stick with the orthographic projection (perspective would be more complex to use).
Please allow me to post a second answer to your question. This one is more high-level and philosophical. It may be a silly, useless answer, but I hope it will help someone new to computer graphics switch their mind into "graphics mode".
You can't really select a triangle on the screen. That square is not 2 triangles. That square is just a bunch of yellow pixels. OpenGL takes some vertices, connects them, processes them and colors some pixels on the screen. At one stage of the graphics pipeline even the geometric information is lost, and you only have isolated pixels. That's analogous to a letter printed by a printer on paper. You usually don't process information from paper (ok, maybe a barcode reader does :D)
If you need to further process your drawings, you have to model them and process them yourself with auxiliary data structures. That's why I suggested you created a rectangle to model your tiles. You create your imaginary "world" of objects, and then render them to screen. The user touch-event does not belong to the same world, so you have to "translate" screen coordinates into your world coordinates. Then you change something in your world (may be the user drags her finger and you have to move an object), and back again tell OpenGL to render your world to screen.
You should operate on your model, not the view. Meshes are more of a view thing, so you shouldn't mix them with the model information, it's a good practice to separate both things. (please, an expert correct me, I'm quite a graphics hobbyist)
Have you checked out LibGDX?
Makes life so much easier when working with OpenGL ES.