Merged with Grails addTo in for loop.
I'm new to Grails and have run into a problem.
I'm building a website for reading stories. My goal is to split the content of a story into several pages when saving it, so that I get a list of pages that is easy to paginate. This is what I did.
I created two domain classes. The first is called Story:
class Story {
String title
List pages
static hasMany=[users:User,pages:Page]
static belongsTo = [User]
static mapping={
users lazy:false
pages lazy:false
}
}
The second domain class is called Page:
class Page {
String content
Story story
static belongsTo = Story
static constraints = {
content(blank:false,size:3..300000)
}
}
The controller's save action looks like this:
def save = {
def storyInstance = new Story(params)
def pages = new Page(params)
String content = pages.content
String[] contentArr = content.split("\r\n")
int i=0
StringBuilder page = new StringBuilder()
for(StringBuilder line:contentArr){
i++
page.append(line+"\r\n")
if(i%10==0){
pages.content = page
storyInstance.addToPages(pages)
page =new StringBuilder()
}
}
if (storyInstance.save(flush:true)) {
flash.message = "${message(code: 'default.created.message', args: [message(code: 'story.label', default: 'Story'), storyInstance.id])}"
redirect(action: "viewstory", id: storyInstance.id)
}else {
render(view: "create", model: [storyInstance: storyInstance])
}
}
I know it looks messy, but it's a prototype. The problem: I expected the line "storyInstance.addToPages(pages)" to add a new Page instance to the story's pages every time the condition is true. What actually happens is that I get only the last instance, with the last page_idx. I thought it would save the pages one by one, so that I could get a list of pages for every story.
Why does this happen, and is there a simpler way to do this?
Any help is appreciated.
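One likely cause, judging from the code above: `pages` is a single Page instance created once before the loop, so every addToPages call adds that same object and each pass only overwrites its content. Any trailing lines that do not fill a full group of 10 are also never added. The sketch below shows the chunking step on its own in plain Java (the class and method names are made up for illustration): a fresh page value is produced for each group of lines, and the remainder is flushed after the loop.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Grails: splits text into pages of N lines.
class PageSplitter {

    /** Returns one String per page; each page holds at most linesPerPage lines. */
    static List<String> splitIntoPages(String content, int linesPerPage) {
        List<String> pages = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        int count = 0;
        // "\r?\n" also accepts bare "\n"; the original code split on "\r\n" only.
        for (String line : content.split("\r?\n")) {
            current.append(line).append("\r\n");
            count++;
            if (count % linesPerPage == 0) {
                pages.add(current.toString());   // a new page value for every chunk
                current = new StringBuilder();
            }
        }
        if (current.length() > 0) {
            pages.add(current.toString());       // flush the last, partial page
        }
        return pages;
    }

    public static void main(String[] args) {
        List<String> pages = splitIntoPages("line 1\r\nline 2\r\nline 3", 2);
        System.out.println("pages: " + pages.size());
    }
}
```

In the Grails action, the equivalent would be to call storyInstance.addToPages(new Page(content: text)) once per chunk instead of reusing one Page object.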
I'm trying to extract data from a web page. For example, let's say I wish to fetch information from chess.org.il.
I know the player's ID is 25022, which means I can request
http://www.chess.org.il/Players/Player.aspx?Id=25022
On that page I can see that this player's FIDE ID is 2821109.
From that, I can request this page:
http://ratings.fide.com/card.phtml?event=2821109
And from that I can see that stdRating=1602.
How can I get the "stdRating" output from a given "localID" input in Java?
(localID, fideID and stdRating are just names I use to clarify the question.)
You could try the univocity-html-parser, which is very easy to use and avoids a lot of spaghetti code.
To get the standard rating for example you can use this code:
public static void main(String... args) {
UrlReaderProvider url = new UrlReaderProvider("http://ratings.fide.com/card.phtml?event={EVENT}");
url.getRequest().setUrlParameter("EVENT", 2821109);
HtmlElement doc = HtmlParser.parseTree(url);
String rating = doc.query()
.match("small").withText("std.")
.match("br").getFollowingText()
.getValue();
System.out.println(rating);
}
Which produces the value 1602.
But getting data by querying individual nodes and trying to stitch all pieces together is not exactly easy.
I expanded the code to illustrate how you can use the parser to get more information into records. Here I created records for the player and her rank details which are available in the table of the second page. It took me less than 1h to get this done:
public static void main(String... args) {
UrlReaderProvider url = new UrlReaderProvider("http://www.chess.org.il/Players/Player.aspx?Id={PLAYER_ID}");
url.getRequest().setUrlParameter("PLAYER_ID", 25022);
HtmlEntityList entities = new HtmlEntityList();
HtmlEntitySettings player = entities.configureEntity("player");
player.addField("id").match("b").withExactText("מספר שחקן").getFollowingText().transform(s -> s.replaceAll(": ", ""));
player.addField("name").match("h1").followedImmediatelyBy("b").withExactText("מספר שחקן").getText();
player.addField("date_of_birth").match("b").withExactText("תאריך לידה:").getFollowingText();
player.addField("fide_id").matchFirst("a").attribute("href", "http://ratings.fide.com/card.phtml?event=*").getText();
HtmlLinkFollower playerCard = player.addField("fide_card_url").matchFirst("a").attribute("href", "http://ratings.fide.com/card.phtml?event=*").getAttribute("href").followLink();
playerCard.addField("rating_std").match("small").withText("std.").match("br").getFollowingText();
playerCard.addField("rating_rapid").match("small").withExactText("rapid").match("br").getFollowingText();
playerCard.addField("rating_blitz").match("small").withExactText("blitz").match("br").getFollowingText();
playerCard.setNesting(Nesting.REPLACE_JOIN);
HtmlEntitySettings ratings = playerCard.addEntity("ratings");
configureRatingsBetween(ratings, "World Rank", "National Rank ISR", "world");
configureRatingsBetween(ratings, "National Rank ISR", "Continent Rank Europe", "country");
configureRatingsBetween(ratings, "Continent Rank Europe", "Rating Chart", "continent");
Results<HtmlParserResult> results = new HtmlParser(entities).parse(url);
HtmlParserResult playerData = results.get("player");
String[] playerFields = playerData.getHeaders();
for(HtmlRecord playerRecord : playerData.iterateRecords()){
for(int i = 0; i < playerFields.length; i++){
System.out.print(playerFields[i] + ": " + playerRecord.getString(playerFields[i]) +"; ");
}
System.out.println();
HtmlParserResult ratingData = playerRecord.getLinkedEntityData().get("ratings");
for(HtmlRecord ratingRecord : ratingData.iterateRecords()){
System.out.print(" * " + ratingRecord.getString("rank_type") + ": ");
System.out.println(ratingRecord.fillFieldMap(new LinkedHashMap<>(), "all_players", "active_players", "female", "u16", "female_u16"));
}
}
}
private static void configureRatingsBetween(HtmlEntitySettings ratings, String startingHeader, String endingHeader, String rankType) {
Group group = ratings.newGroup()
.startAt("table").match("b").withExactText(startingHeader)
.endAt("b").withExactText(endingHeader);
group.addField("rank_type", rankType);
group.addField("all_players").match("tr").withText("World (all", "National (all", "Rank (all").match("td", 2).getText();
group.addField("active_players").match("tr").followedImmediatelyBy("tr").withText("Female (active players):").match("td", 2).getText();
group.addField("female").match("tr").withText("Female (active players):").match("td", 2).getText();
group.addField("u16").match("tr").withText("U-16 Rank (active players):").match("td", 2).getText();
group.addField("female_u16").match("tr").withText("Female U-16 Rank (active players):").match("td", 2).getText();
}
The output will be:
id: 25022; name: יעל כהן; date_of_birth: 02/02/2003; fide_id: 2821109; rating_std: 1602; rating_rapid: 1422; rating_blitz: 1526;
* world: {all_players=195907, active_players=94013, female=5490, u16=3824, female_u16=586}
* country: {all_players=1595, active_players=1024, female=44, u16=51, female_u16=3}
* continent: {all_players=139963, active_players=71160, female=3757, u16=2582, female_u16=372}
Hope it helps
Disclosure: I'm the author of this library. It's commercial closed source but it can save you a lot of development time.
As @Alex R pointed out, you'll need a web scraping library for this.
The one he recommended, JSoup, is quite robust and is pretty commonly used for this task in Java, at least in my experience.
You'd first need to fetch the page into a Document, e.g.:
int localID = 25022; //your player's ID.
Document doc = Jsoup.connect("http://www.chess.org.il/Players/Player.aspx?Id=" + localID).get();
From this Document object you can fetch a lot of information, for example the FIDE ID you asked about. Unfortunately, the web page you linked isn't very simple to scrape, and you'll basically need to go through every link on the page to find the relevant one, for example:
Elements fidelinks = doc.select("a[href*=fide.com]");
This Elements object should give you a list of all links whose href contains the text fide.com, but you probably only want the first one, e.g.:
Element fideurl = doc.selectFirst("a[href*=fide.com]");
From that point on, I don't want to write all the code for you, but hopefully this answer serves as a good starting point!
You can get the ID alone by calling the text() method on your Element object, and you can get the link itself by calling attr("href") on it.
The css selector you can use to get the other value is
div#main-col table.contentpaneopen tbody tr td table tbody tr td table tbody tr:nth-of-type(4) td table tbody tr td:first-of-type, which will get you the std score specifically, at least with standard css, so this should work with jsoup as well.
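To chain the two requests from the question, the FIDE ID can also be read from the link's href rather than its text, since the card URL carries it in the event parameter. A minimal sketch using only the JDK, assuming the href has the form shown in the question; the class and method names are hypothetical:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration; not part of jsoup.
class FideLink {

    // Matches the numeric value of the "event" query parameter.
    private static final Pattern EVENT = Pattern.compile("[?&]event=(\\d+)");

    /** Returns the FIDE ID found in the href, or null if there is none. */
    static String fideIdFromHref(String href) {
        Matcher m = EVENT.matcher(href);
        return m.find() ? m.group(1) : null;
    }

    /** Builds the rating card URL for a FIDE ID, following the URL in the question. */
    static String cardUrl(String fideId) {
        return "http://ratings.fide.com/card.phtml?event=" + fideId;
    }

    public static void main(String[] args) {
        String href = "http://ratings.fide.com/card.phtml?event=2821109";
        System.out.println(fideIdFromHref(href));
    }
}
```

With jsoup, the href passed in here would be the value returned by attr("href") on the link element selected above.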
I've been looking for a solution to this for a while. I'm working on a small side project to play around with mini search engines. I've created a series of Java classes that crawl a certain number of links from a web page and store the information in a JDBM RecordManager HTree.
When I run a print function for the contents of this RecordManager, I get the contents just fine, but when I try to do the same in a JSP file on my Tomcat server, the object that is supposed to be returned by this print function is empty. (Note: I have an HTML page that sends the necessary string to this JSP file.)
The DataManager object, when created, is supposed to "create and initialize a RecordManager".
The querySimilarity function is supposed to return a Vector of integer pageIDs taken from the generated RecordManager.
Any ideas? The code below is from my JSP file.
<%@ page import="java.util.Vector,searchEngine.*,jdbc.*" %>
<%
out.println("The words you entered are: <br>");
String arr = request.getParameter("words");
String[] a = arr.split(" ");
Vector<Integer> pageIDList = new Vector<Integer>();
DataManager dm = new DataManager();
pageIDList = dm.querySimilarity(a);
for(int i = 0; i < pageIDList.size(); i++){
out.println(pageIDList.get(i) + "<br>");
}
%>
[Screenshot: additional fields in useradmin]
How can I add new user properties to CQ users?
I found a solution, but it doesn't work: http://experience-aem.blogspot.ch/2014/01/aem-cq-56-extend-useradmin-add-new-user.html
I tried editing UserProperties.js in CRX to add new properties. I can see them in useradmin, but if I set a new property from Java code (not via useradmin), it saves without error and yet the value is empty in useradmin.
And if I enter a value for the new property via useradmin, all users get the same value.
How can I add new user properties whose values I can set from Java code, like the standard properties?
user = userManager.createUser(username, password);
ValueFactory valueFactory = session.getValueFactory();
emailValue = valueFactory.createValue(email);
givennameValue = valueFactory.createValue(givenname);
nameValue = valueFactory.createValue(name);
//User class just accepts Value Object
user.setProperty("profile/" + UserProperties.EMAIL, emailValue);
user.setProperty("profile/" + UserProperties.FAMILY_NAME, nameValue);
user.setProperty("profile/" + UserProperties.GIVEN_NAME, givennameValue);
I found a solution.
Go to /libs/cq/security/widgets/source/widgets/security/UserProperties.js in CRX.
Add the fields you need to the items array of the user form (caution: the items for users and the items for groups are in the same place).
In the loadRecord method of that JS file, add each new field to the "record" object.
"items":[{
"xtype":"textfield",
"fieldLabel":CQ.I18n.getMessage("Mail"),
"anchor":"100%",
"vtype":"email",
"msgTarget":"under",
"name":"email"
},{
"xtype":"textfield",
"fieldLabel":CQ.I18n.getMessage("My Field"),
"anchor":"100%",
"msgTarget":"under",
"name":"myfield"
},{
"xtype":"textarea",
"fieldLabel":CQ.I18n.getMessage("About"),
"anchor":"100% -155",
"name":"aboutMe"
}],
loadRecord: function(rec) {
this.enableUserSaveButton(false);
this.enableGroupSaveButton(false);
var type = rec.get("type");
if (type=="user") {
this.activeForm = this.userForm;
this.hiddenForm = this.groupForm;
if (rec.id==CQ.security.UserProperties.ADMIN_ID) {
this.pwdButtons.each(function(bt) {bt.hide(); return true;} )
} else {
this.pwdButtons.each(function(bt) {bt.show(); return true;} )
}
} else {
this.activeForm = this.groupForm;
this.hiddenForm = this.userForm;
}
//load the additional property from the JSON so that it shows up in the form
rec.data["myfield"] = rec.json["myfield"];
this.activeForm.getForm().loadRecord(rec);
In the Java code you can then set the new properties via the "user" object. Note that the properties are stored under the "profile" subfolder.
user.setProperty("profile/" + "myfield", myFieldValue);
Did you try the second approach, posted by "pedro" in the link you've posted?
It probably has to do with pushing the new field to the record:
http://experience-aem.blogspot.com/2014/01/aem-cq-56-extend-useradmin-add-new-user.html?showComment=1390804750445#c2823498719990547675
I hope this helps. The file is at http://[host name]:[port]/crx/de/index.jsp#/libs/cq/security/widgets/source/widgets/security/UserProperties.js
It has two major properties: this.userForm for users and this.groupForm for groups.
I have two ExtJS TreeStores: one is temporary, the other is the working TreeStore whose data is rendered by a TreePanel. Code:
var treeStore = Ext.create('Ext.data.TreeStore',{
root: {
children:[],
expanded: true,
text: 'Services'
}
});
var tempStore = Ext.create('Ext.data.TreeStore',{
autoLoad: false,
proxy:{
type: 'ajax',
url: 'server.jsp',
reader: {
type:'json'
}
},
clearOnLoad: true,
listeners:{
load:{
fn: afterload
}
}
});
tempStore.load();
function afterload(store)
{
var rootTree = treeStore.getRootNode();
var copyChilds = Ext.clone(store.getRootNode().childNodes);
if(rootTree.hasChildNodes())
{
rootTree.removeAll(false);
}
for(var i=0;i<copyChilds.length;i++)
{
rootTree.appendChild(copyChilds[i]);
}
}
When tempStore.load is invoked, it sends a query to the server, gets the data, and then copies it into the other TreeStore.
The tempStore reloads every 3 seconds using a TaskRunner:
var taskTree = {
run: reloadTree,
interval:3000 // 3 seconds
}
var runReloadTreePanel = Ext.create("Ext.util.TaskRunner");
I have a checkbox; when its state changes to true, the runner starts:
var checkbox = Ext.create('Ext.form.field.Checkbox',{
boxLabel: 'Refresh',
listeners :{
change: {
fn: changeCheckBoxState
}
}
});
function changeCheckBoxState(field,newValue,oldValue)
{
if(newValue)
{
runReloadTreePanel.start(taskTree);
}
else
{
runReloadTreePanel.stop(taskTree);
}
}
The problem:
After 2-3 minutes of running, memory starts to grow (3-4 MB every 10 seconds) because of what the afterload method does, which is copying data. Is it a bug, or am I doing something wrong? Maybe removeAll(false) doesn't remove all the child data of the treeStore root node? I tried many variants: clearing the DOM elements using Ext.select(..) + innerHtml, and removing the records one by one using remove(..), but the memory still grows.
My TreePanel renders everything fine, without unnecessary data.
Any ideas? Sorry if my English isn't good.
I tried changing the body of the load function to this:
function load(store)
{
var childs = Ext.clone(store.getRootNode().childNodes);
Ext.select('div#treepanel-1032-body div table tbody tr[class^="x-grid-row"]').remove();
treepanel.setRootNode({
text: "Services",
expanded: true,
children: childs
});
}
I removed the DOM elements directly, removed the store from the TreePanel's config properties, and set a new root every 3 seconds, but the memory still grows, especially in IE. Any advice?
I need to create an automated process (preferably using Java) that will:
Open browser with specific url.
Login, using the username and password specified.
Follow one of the links on the page.
Refresh the browser.
Log out.
This is basically done to gather some statistics for analysis. Every time a user follows the link, a bunch of data is generated for that user and saved in the database. What I need to do is ping the page every 5-15 minutes, using around 10 fake users.
Can you think of a simple way of doing that? There has to be an alternative to an endless manual login-refresh-logout process...
Try Selenium.
It's not Java, but JavaScript. You could do something like:
window.location = "<url>"
document.getElementById("username").value = "<email>";
document.getElementById("password").value = "<password>";
document.getElementById("login_box_button").click();
...
etc
With this kind of structure you can easily cover 1-3. Throw in some for loops for page refreshes and you're done.
Use HtmlUnit if you want fast, simple, Java-based web interaction/crawling.
For example: here is some simple code showing a bunch of output and an example of accessing all IMG elements of the loaded page.
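// Package names below are those of HtmlUnit 2.x.
import com.gargoylesoftware.htmlunit.FailingHttpStatusCodeException;
import com.gargoylesoftware.htmlunit.WebClient;
import com.gargoylesoftware.htmlunit.html.HtmlElement;
import com.gargoylesoftware.htmlunit.html.HtmlPage;
import java.io.IOException;
import java.net.MalformedURLException;
public class HtmlUnitTest {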
public static void main(String[] args) throws FailingHttpStatusCodeException, MalformedURLException, IOException {
final WebClient webClient = new WebClient();
final HtmlPage page = webClient.getPage("http://www.google.com");
System.out.println(page.getTitleText());
for (HtmlElement node : page.getHtmlElementDescendants()) {
if (node.getTagName().equalsIgnoreCase("IMG")) {
System.out.println("NAME: " + node.getTagName());
System.out.println("WIDTH:" + node.getAttribute("width"));
System.out.println("HEIGHT:" + node.getAttribute("height"));
System.out.println("TEXT: " + node.asText());
System.out.println("XML: " + node.asXml());
}
}
}
}
Example #2 Accessing named input fields and entering data/clicking:
final HtmlPage page = webClient.getPage("http://www.google.com");
HtmlElement inputField = page.getElementByName("q");
inputField.type("Example input");
HtmlElement btnG = page.getElementByName("btnG");
Page secondPage = btnG.click();
if (secondPage instanceof HtmlPage) {
System.out.println(page.getTitleText());
System.out.println(((HtmlPage)secondPage).getTitleText());
}
NB: You can use page.refresh() on any Page object.
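Whichever client drives the browser steps, the 5-15 minute cadence with about 10 fake users from the question can be handled by a scheduler. A rough sketch using only the JDK; the `visit` action stands in for the login, follow-link, refresh and logout sequence, and all names here are assumptions for illustration:

```java
import java.util.Random;
import java.util.concurrent.Executors;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

// Hypothetical scheduler sketch; not part of HtmlUnit or Selenium.
class VisitScheduler {

    /** Picks a delay between min and max, both inclusive. */
    static long nextDelay(Random rnd, int min, int max) {
        return min + rnd.nextInt(max - min + 1);
    }

    /** Runs visit after a random delay, then reschedules it until the pool is shut down. */
    static void scheduleVisit(ScheduledExecutorService pool, Runnable visit,
                              Random rnd, int min, int max, TimeUnit unit) {
        try {
            pool.schedule(() -> {
                try {
                    visit.run();
                } finally {
                    scheduleVisit(pool, visit, rnd, min, max, unit);
                }
            }, nextDelay(rnd, min, max), unit);
        } catch (RejectedExecutionException e) {
            // The pool has been shut down, so stop rescheduling.
        }
    }

    public static void main(String[] args) throws InterruptedException {
        int users = 10;
        ScheduledExecutorService pool = Executors.newScheduledThreadPool(users);
        Random rnd = new Random();
        for (int u = 1; u <= users; u++) {
            final int user = u;
            // Demo uses milliseconds; a real run would pass TimeUnit.MINUTES.
            scheduleVisit(pool, () -> System.out.println("visit by fake user " + user),
                          rnd, 5, 15, TimeUnit.MILLISECONDS);
        }
        Thread.sleep(100);   // let the demo run briefly
        pool.shutdownNow();
    }
}
```

Each fake user gets its own independently randomized schedule, so the visits do not all land at the same moment.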
You could use Jakarta JMeter