Weka and unknown datatype - java
After successfully loading a MySQL database into Weka and running a simple query, I press OK and get:
Couldn't read from database:
Unknown data type: INT. Add entry in weka/experiment/DatabaseUtils.props.
If the type contains blanks, either escape them with a backslash or use underscores instead of blanks.
I've searched extensively for a solution, but nothing works.
I tried adding int=5 to the DatabaseUtils.props file.
I also tried int_unsigned=5.
My DatabaseUtils.props file looks like this:
# Database settings for MySQL 3.23.x, 4.x
#
# General information on database access can be found here:
# https://waikato.github.io/weka-wiki/databases/
#
# url: http://www.mysql.com/
# jdbc: http://www.mysql.com/products/connector/j/
# author: Fracpete (fracpete at waikato dot ac dot nz)
# version: $Revision: 15255 $
# JDBC driver (comma-separated list)
jdbcDriver=org.gjt.mm.mysql.Driver
# database URL
jdbcURL=jdbc:mysql://zalpha-db.cei00brrdcsc.eu-west-1.rds.amazonaws.com/mqdata
# specific data types
# string, getString() = 0; --> nominal
# boolean, getBoolean() = 1; --> nominal
# double, getDouble() = 2; --> numeric
# byte, getByte() = 3; --> numeric
# short, getByte()= 4; --> numeric
# int, getInteger() = 5; --> numeric
# long, getLong() = 6; --> numeric
# float, getFloat() = 7; --> numeric
# date, getDate() = 8; --> date
# text, getString() = 9; --> string
# time, getTime() = 10; --> date
# timestamp, getTime() = 11; --> date
INT = 5
# other options
CREATE_DOUBLE=DOUBLE
CREATE_STRING=TEXT
CREATE_INT=INT
CREATE_DATE=DATETIME
DateFormat=yyyy-MM-dd HH:mm:ss
checkUpperCaseNames=false
checkLowerCaseNames=false
checkForTable=true
# All the reserved keywords for this database
# Based on the keywords listed at the following URL (2009-04-13):
# http://dev.mysql.com/doc/mysqld-version-reference/en/mysqld-version-reference-reservedwords-5-0.html
Keywords=\
ADD,\
ALL,\
ALTER,\
ANALYZE,\
AND,\
AS,\
ASC,\
ASENSITIVE,\
BEFORE,\
BETWEEN,\
BIGINT,\
BINARY,\
BLOB,\
BOTH,\
BY,\
CALL,\
CASCADE,\
CASE,\
CHANGE,\
CHAR,\
CHARACTER,\
CHECK,\
COLLATE,\
COLUMN,\
COLUMNS,\
CONDITION,\
CONNECTION,\
CONSTRAINT,\
CONTINUE,\
CONVERT,\
CREATE,\
CROSS,\
CURRENT_DATE,\
CURRENT_TIME,\
CURRENT_TIMESTAMP,\
CURRENT_USER,\
CURSOR,\
DATABASE,\
DATABASES,\
DAY_HOUR,\
DAY_MICROSECOND,\
DAY_MINUTE,\
DAY_SECOND,\
DEC,\
DECIMAL,\
DECLARE,\
DEFAULT,\
DELAYED,\
DELETE,\
DESC,\
DESCRIBE,\
DETERMINISTIC,\
DISTINCT,\
DISTINCTROW,\
DIV,\
DOUBLE,\
DROP,\
DUAL,\
EACH,\
ELSE,\
ELSEIF,\
ENCLOSED,\
ESCAPED,\
EXISTS,\
EXIT,\
EXPLAIN,\
FALSE,\
FETCH,\
FIELDS,\
FLOAT,\
FLOAT4,\
FLOAT8,\
FOR,\
FORCE,\
FOREIGN,\
FROM,\
FULLTEXT,\
GOTO,\
GRANT,\
GROUP,\
HAVING,\
HIGH_PRIORITY,\
HOUR_MICROSECOND,\
HOUR_MINUTE,\
HOUR_SECOND,\
IF,\
IGNORE,\
IN,\
INDEX,\
INFILE,\
INNER,\
INOUT,\
INSENSITIVE,\
INSERT,\
INT,\
INT1,\
INT2,\
INT3,\
INT4,\
INT8,\
INTEGER,\
INTERVAL,\
INTO,\
IS,\
ITERATE,\
JOIN,\
KEY,\
KEYS,\
KILL,\
LABEL,\
LEADING,\
LEAVE,\
LEFT,\
LIKE,\
LIMIT,\
LINES,\
LOAD,\
LOCALTIME,\
LOCALTIMESTAMP,\
LOCK,\
LONG,\
LONGBLOB,\
LONGTEXT,\
LOOP,\
LOW_PRIORITY,\
MATCH,\
MEDIUMBLOB,\
MEDIUMINT,\
MEDIUMTEXT,\
MIDDLEINT,\
MINUTE_MICROSECOND,\
MINUTE_SECOND,\
MOD,\
MODIFIES,\
NATURAL,\
NOT,\
NO_WRITE_TO_BINLOG,\
NULL,\
NUMERIC,\
ON,\
OPTIMIZE,\
OPTION,\
OPTIONALLY,\
OR,\
ORDER,\
OUT,\
OUTER,\
OUTFILE,\
PRECISION,\
PRIMARY,\
PRIVILEGES,\
PROCEDURE,\
PURGE,\
READ,\
READS,\
REAL,\
REFERENCES,\
REGEXP,\
RELEASE,\
RENAME,\
REPEAT,\
REPLACE,\
REQUIRE,\
RESTRICT,\
RETURN,\
REVOKE,\
RIGHT,\
RLIKE,\
SCHEMA,\
SCHEMAS,\
SECOND_MICROSECOND,\
SELECT,\
SENSITIVE,\
SEPARATOR,\
SET,\
SHOW,\
SMALLINT,\
SONAME,\
SPATIAL,\
SPECIFIC,\
SQL,\
SQLEXCEPTION,\
SQLSTATE,\
SQLWARNING,\
SQL_BIG_RESULT,\
SQL_CALC_FOUND_ROWS,\
SQL_SMALL_RESULT,\
SSL,\
STARTING,\
STRAIGHT_JOIN,\
TABLE,\
TABLES,\
TERMINATED,\
THEN,\
TINYBLOB,\
TINYINT,\
TINYTEXT,\
TO,\
TRAILING,\
TRIGGER,\
TRUE,\
UNDO,\
UNION,\
UNIQUE,\
UNLOCK,\
UNSIGNED,\
UPDATE,\
UPGRADE,\
USAGE,\
USE,\
USING,\
UTC_DATE,\
UTC_TIME,\
UTC_TIMESTAMP,\
VALUES,\
VARBINARY,\
VARCHAR,\
VARCHARACTER,\
VARYING,\
WHEN,\
WHERE,\
WHILE,\
WITH,\
WRITE,\
XOR,\
YEAR_MONTH,\
ZEROFILL
# The character to append to attribute names to avoid exceptions due to
# clashes between keywords and attribute names
KeywordsMaskChar=_
#flags for loading and saving instances using DatabaseLoader/Saver
nominalToStringLimit=50
idColumn=auto_generated_id
Has anyone solved this issue, or does anyone know how to?
That error message means the INT data type is not recognized. To fix it, check your DatabaseUtils.props file and add INT=5 and INT_UNSIGNED=5. Note that you have to do this in the properties file, not in the MySQL type file.
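As a rough illustration of why the exact spelling of the key matters, here is a minimal sketch of the kind of lookup the error message describes. It uses only java.util.Properties and is not Weka's actual code: the class TypeLookupSketch and the methods parse and lookupType are hypothetical, and the assumption that the type name is matched case-sensitively, with blanks optionally replaced by underscores, is inferred from the wording of the error message.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.Properties;

// Hypothetical sketch, not Weka's implementation.
class TypeLookupSketch {

    // Parses properties-file text the same way a .props file would be read.
    static Properties parse(String text) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(text));
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
        return props;
    }

    // Looks the type name up as-is first, then with blanks replaced by
    // underscores (assumed behaviour, based on the error message).
    static int lookupType(Properties props, String type) {
        String value = props.getProperty(type);
        if (value == null) {
            value = props.getProperty(type.replace(' ', '_'));
        }
        if (value == null) {
            throw new IllegalArgumentException("Unknown data type: " + type);
        }
        return Integer.parseInt(value.trim());
    }

    public static void main(String[] args) {
        Properties props = parse("INT=5\nINT_UNSIGNED=5\n");
        System.out.println(lookupType(props, "INT"));          // 5
        System.out.println(lookupType(props, "INT UNSIGNED")); // 5

        // Property keys are case-sensitive: a lowercase key does not cover "INT".
        Properties lower = parse("int=5\n");
        System.out.println(lower.getProperty("INT"));          // null
    }
}
```

Under these assumptions, the lowercase int=5 entry tried in the question would not match the uppercase INT that the driver reports, while INT=5 and INT_UNSIGNED=5 would.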
Related
python kivy android app crashes after running apk on mobile device
I was trying to develop an android app using the kivy python framework. The program connects with a remote mysql database. A part of the code (registration and login page) was tested in pyCharm and found to be working perfectly. For converting to an android app, Ubuntu 19.10 OS running on Oracle VM VirtualBox was used. APK file was obtained by running command buildozer android debug. But on running the command buildozer android deploy run, the following output with error message comes and app crashes. List of devices attached ZX1PC222GV device Run on ZX1PC222GV Run '/home/nirmal/.buildozer/android/platform/android-sdk/platform-tools/adb shell am start -n org.test.kkfoodies/org.kivy.android.PythonActivity -a org.kivy.android.PythonActivity' Cwd /home/nirmal/.buildozer/android/platform Starting: Intent { act=org.kivy.android.PythonActivity cmp=org.test.kkfoodies/org.kivy.android.PythonActivity } Error type 3 Error: Activity class {org.test.kkfoodies/org.kivy.android.PythonActivity} does not exist. Application started Here is my buildozer.spec file [app] # (str) Title of your application title = KK Foodies # (str) Package name package.name = kkfoodies # (str) Package domain (needed for android/ios packaging) package.domain = org.test # (str) Source code where the main.py live source.dir = . 
# (list) Source files to include (let empty to include all the files) source.include_exts = py,png,jpg,kv,atlas # (list) List of inclusions using pattern matching #source.include_patterns = assets/*,images/*.png # (list) Source files to exclude (let empty to not exclude anything) #source.exclude_exts = spec # (list) List of directory to exclude (let empty to not exclude anything) #source.exclude_dirs = tests, bin # (list) List of exclusions using pattern matching #source.exclude_patterns = license,images/*/*.jpg # (str) Application versioning (method 1) version = 1.0 # (str) Application versioning (method 2) # version.regex = __version__ = ['"](.*)['"] # version.filename = %(source.dir)s/main.py # (list) Application requirements # comma separated e.g. requirements = sqlite3,kivy requirements = python3,kivy # (str) Custom source folders for requirements # Sets custom source for any requirements with recipes # requirements.source.kivy = ../../kivy # (list) Garden requirements #garden_requirements = # (str) Presplash of the application #presplash.filename = %(source.dir)s/data/presplash.png # (str) Icon of the application #icon.filename = %(source.dir)s/data/icon.png # (str) Supported orientation (one of landscape, sensorLandscape, portrait or all) orientation = portrait # (list) List of service to declare #services = NAME:ENTRYPOINT_TO_PY,NAME2:ENTRYPOINT2_TO_PY # # OSX Specific # # # author = © Copyright Info # change the major version of python used by the app osx.python_version = 3 # Kivy version to use osx.kivy_version = 1.9.1 # # Android specific # # (bool) Indicate if the application should be fullscreen or not fullscreen = 0 # (string) Presplash background color (for new android toolchain) # Supported formats are: #RRGGBB #AARRGGBB or one of the following names: # red, blue, green, black, white, gray, cyan, magenta, yellow, lightgray, # darkgray, grey, lightgrey, darkgrey, aqua, fuchsia, lime, maroon, navy, # olive, purple, silver, teal. 
#android.presplash_color = #FFFFFF # (list) Permissions #android.permissions = INTERNET # (int) Target Android API, should be as high as possible. #android.api = 27 # (int) Minimum API your APK will support. android.minapi = 21 # (int) Android SDK version to use #android.sdk = 20 # (str) Android NDK version to use #android.ndk = 17c # (int) Android NDK API to use. This is the minimum API your app will support, it should usually match android.minapi. #android.ndk_api = 21 # (bool) Use --private data storage (True) or --dir public storage (False) #android.private_storage = True # (str) Android NDK directory (if empty, it will be automatically downloaded.) #android.ndk_path = # (str) Android SDK directory (if empty, it will be automatically downloaded.) #android.sdk_path = # (str) ANT directory (if empty, it will be automatically downloaded.) #android.ant_path = # (bool) If True, then skip trying to update the Android sdk # This can be useful to avoid excess Internet downloads or save time # when an update is due and you just want to test/build your package # android.skip_update = False # (bool) If True, then automatically accept SDK license # agreements. This is intended for automation only. If set to False, # the default, you will be shown the license when first running # buildozer. # android.accept_sdk_license = False # (str) Android entry point, default is ok for Kivy-based app #android.entrypoint = org.renpy.android.PythonActivity # (str) Android app theme, default is ok for Kivy-based app # android.apptheme = "#android:style/Theme.NoTitleBar" # (list) Pattern to whitelist for the whole project #android.whitelist = # (str) Path to a custom whitelist file #android.whitelist_src = # (str) Path to a custom blacklist file #android.blacklist_src = # (list) List of Java .jar files to add to the libs so that pyjnius can access # their classes. Don't add jars that you do not need, since extra jars can slow # down the build process. 
Allows wildcards matching, for example: # OUYA-ODK/libs/*.jar #android.add_jars = foo.jar,bar.jar,path/to/more/*.jar # (list) List of Java files to add to the android project (can be java or a # directory containing the files) #android.add_src = # (list) Android AAR archives to add (currently works only with sdl2_gradle # bootstrap) #android.add_aars = # (list) Gradle dependencies to add (currently works only with sdl2_gradle # bootstrap) #android.gradle_dependencies = # (list) add java compile options # this can for example be necessary when importing certain java libraries using the 'android.gradle_dependencies' option # see https://developer.android.com/studio/write/java8-support for further information # android.add_compile_options = "sourceCompatibility = 1.8", "targetCompatibility = 1.8" # (list) Gradle repositories to add {can be necessary for some android.gradle_dependencies} # please enclose in double quotes # e.g. android.gradle_repositories = "maven { url 'https://kotlin.bintray.com/ktor' }" #android.add_gradle_repositories = # (list) packaging options to add # see https://google.github.io/android-gradle- dsl/current/com.android.build.gradle.internal.dsl.PackagingOptions.html # can be necessary to solve conflicts in gradle_dependencies # please enclose in double quotes # e.g. android.add_packaging_options = "exclude 'META- INF/common.kotlin_module'", "exclude 'META-INF/*.kotlin_module'" #android.add_gradle_repositories = # (list) Java classes to add as activities to the manifest. #android.add_activites = com.example.ExampleActivity # (str) OUYA Console category. Should be one of GAME or APP # If you leave this blank, OUYA support will not be enabled #android.ouya.category = GAME # (str) Filename of OUYA Console icon. It must be a 732x412 png image. 
#android.ouya.icon.filename = %(source.dir)s/data/ouya_icon.png # (str) XML file to include as an intent filters in <activity> tag #android.manifest.intent_filters = # (str) launchMode to set for the main activity #android.manifest.launch_mode = standard # (list) Android additional libraries to copy into libs/armeabi #android.add_libs_armeabi = libs/android/*.so #android.add_libs_armeabi_v7a = libs/android-v7/*.so #android.add_libs_arm64_v8a = libs/android-v8/*.so #android.add_libs_x86 = libs/android-x86/*.so #android.add_libs_mips = libs/android-mips/*.so # (bool) Indicate whether the screen should stay on # Don't forget to add the WAKE_LOCK permission if you set this to True #android.wakelock = False # (list) Android application meta-data to set (key=value format) #android.meta_data = # (list) Android library project to add (will be added in the # project.properties automatically.) #android.library_references = # (list) Android shared libraries which will be added to AndroidManifest.xml using <uses-library> tag #android.uses_library = # (str) Android logcat filters to use #android.logcat_filters = *:S python:D # (bool) Copy library instead of making a libpymodules.so #android.copy_libs = 1 # (str) The Android arch to build for, choices: armeabi-v7a, arm64-v8a, x86, x86_64 android.arch = armeabi-v7a # # Python for android (p4a) specific # # (str) python-for-android fork to use, defaults to upstream (kivy) #p4a.fork = kivy # (str) python-for-android branch to use, defaults to master #p4a.branch = master # (str) python-for-android git clone directory (if empty, it will be automatically cloned from github) #p4a.source_dir = # (str) The directory in which python-for-android should look for your own build recipes (if any) #p4a.local_recipes = # (str) Filename to the hook for p4a #p4a.hook = # (str) Bootstrap to use for android builds # p4a.bootstrap = sdl2 # (int) port number to specify an explicit --port= p4a argument (eg for bootstrap flask) #p4a.port = # # iOS 
specific # # (str) Path to a custom kivy-ios folder #ios.kivy_ios_dir = ../kivy-ios # Alternately, specify the URL and branch of a git checkout: ios.kivy_ios_url = https://github.com/kivy/kivy-ios ios.kivy_ios_branch = master # Another platform dependency: ios-deploy # Uncomment to use a custom checkout #ios.ios_deploy_dir = ../ios_deploy # Or specify URL and branch ios.ios_deploy_url = https://github.com/phonegap/ios-deploy ios.ios_deploy_branch = 1.7.0 # (str) Name of the certificate to use for signing the debug version # Get a list of available identities: buildozer ios list_identities #ios.codesign.debug = "iPhone Developer: <lastname> <firstname> (<hexstring>)" # (str) Name of the certificate to use for signing the release version #ios.codesign.release = %(ios.codesign.debug)s [buildozer] # (int) Log level (0 = error only, 1 = info, 2 = debug (with command output)) log_level = 2 # (int) Display warning if buildozer is run as root (0 = False, 1 = True) warn_on_root = 1 # (str) Path to build artifact storage, absolute or relative to spec file # build_dir = ./.buildozer # (str) Path to build output (i.e. .apk, .ipa) storage # bin_dir = ./bin # ----------------------------------------------------------------------------- # List as sections # # You can define all the "list" as [section:key]. # Each line will be considered as a option to the list. # Let's take [app] / source.exclude_patterns. # Instead of doing: # #[app] #source.exclude_patterns = license,data/audio/*.wav,data/images/original/* # # This can be translated into: # #[app:source.exclude_patterns] #license #data/audio/*.wav #data/images/original/* # # ----------------------------------------------------------------------------- # Profiles # # You can extend section / key with a profile # For example, you want to deploy a demo version of your application without # HD content. You could first change the title to add "(demo)" in the name # and extend the excluded directories to remove the HD content. 
# #[app#demo] #title = My Application (demo) # #[app:source.exclude_patterns#demo] #images/hd/* # # Then, invoke the command line with the "demo" profile: # #buildozer --profile demo android debug The python file is given below. from kivy.app import App from kivy.properties import ObjectProperty from kivy.uix.boxlayout import BoxLayout from kivy.uix.relativelayout import RelativeLayout from kivy.uix.scrollview import ScrollView from kivy.uix.label import Label from kivy.uix.button import Button from kivy.uix.togglebutton import ToggleButton from kivy.uix.checkbox import CheckBox from kivy.uix.spinner import Spinner from kivy.properties import NumericProperty #, ListProperty from kivy.uix.textinput import TextInput from kivy.uix.popup import Popup # from kivy.uix.bubble import Bubble from kivy.uix.image import Image from kivy.lang import Builder import MySQLdb import hashlib, binascii, os from kivy.uix.screenmanager import ScreenManager, Screen # from datetime import date, timedelta from kivy.uix.popup import Popup from datepicker import DatePicker # , CalendarWidget import base64 from functools import partial import textwrap # import numpy as np # import cv2 # import io # import PIL.Image #_imaging = PIL.Image.core # from PIL import Image # from PIL.Image import core as _imaging # import Image # import sys # import cStringIO # import timepicker # from kivy.garden.circulardatetimepicker import CircularTimePicker from kivy.core.window import Window # Window.clearcolor = (204/255, 1, 244/255, 0) class ScreenManagement(ScreenManager): pass myname = '' selldate = '' roomno = '' try: dbconnect = MySQLdb.connect("IP", "username", "password", "dbname") except (MySQLdb.Error) as e: # , MySQLdb.Warning print("Can't connect to database", e) exit() # return 0 # If Connection Is Successful # print("Connected") crsr = dbconnect.cursor() class ListHeader(Button): def __init__(self, **kwargs): super().__init__(**kwargs) # bcolor = ListProperty([1, 1, 1, 1]) class ListCell(Label): 
def __init__(self, **kwargs): super().__init__(**kwargs) # bcolor = ListProperty([1, 1, 1, 1]) class DbCon: def __init__(self, **kwargs): super().__init__(**kwargs) def get_row(self, uname): query = "SELECT username, passwordh, emp_no, full_name, room_no, authorisation_flag FROM residents WHERE username = '%s" % uname + "'" # query = "SELECT userid, pass, mob, fname, lname, mailid FROM login_data WHERE userid = '%s" % uname + "'" print(query) crsr.execute(query) return crsr.fetchone() def add_row(self, username, password, empnum, fullname, roomnum): hashedpwd = self.hash_password(password) print(hashedpwd) sqlquery = "INSERT INTO residents (username, passwordh, emp_no, full_name, room_no) VALUES (%s, %s, %s, %s, %s)" insert_values = (username, hashedpwd, empnum, fullname, roomnum) crsr.execute(sqlquery, insert_values) dbconnect.commit() print(crsr.rowcount, " record inserted.") def hash_password(self, password): """Hash a password for storing.""" salt = hashlib.sha256(os.urandom(60)).hexdigest().encode('ascii') pwdhash = hashlib.pbkdf2_hmac('sha512', password.encode('utf-8'), salt, 10000) pwdhash = binascii.hexlify(pwdhash) return (salt + pwdhash).decode('ascii') class LoginWindow(Screen): def __init__(self, **kwargs): super().__init__(**kwargs) self.db = DbCon() def validate_user(self): user = self.ids.username_field pwd = self.ids.pwd_field info = self.ids.info global myname, roomno uname = user.text passw = pwd.text if uname == '' or passw == '': info.text = '[color=#FF0000]username and/ or password required[/color]' else: row = self.db.get_row(uname) if row == '': info.text = '[color=#FF0000]Invalid Username and/or Password[/color]' else: stored_passwd = row[1] myname = row[3] roomno = row[4] print(stored_passwd) print(passw) if self.verify_password(stored_passwd, passw): # if stored_passwd == passw: if row[5] == 1: info.text = '[color=#00FF00]Logged In successfully!!![/color]' self.parent.current = 'menu' else: info.text = '[color=#FF0000]Sorry, you are not 
authorised. Please collect authorisation details from administrator.[/color]' else: info.text = '[color=#FF0000]Incorrect Password[/color]' def verify_password(self, stored_password, provided_password): """Verify a stored password against one provided by user""" salt = stored_password[:64] stored_password = stored_password[64:] pwdhash = hashlib.pbkdf2_hmac('sha512', provided_password.encode('utf-8'), salt.encode('ascii'), 10000) pwdhash = binascii.hexlify(pwdhash).decode('ascii') print(pwdhash) return pwdhash == stored_password class RegisterWindow(Screen): def __init__(self, **kwargs): super().__init__(**kwargs) self.db2 = DbCon() def register_user(self): user = self.ids.username_field pwd = self.ids.pwd_field rpt_pwd = self.ids.pwd_field_rpt fname = self.ids.full_name empnum = self.ids.emp_no roomnum = self.ids.room_no info = self.ids.info uname = user.text passw = pwd.text rpassw = rpt_pwd.text name = fname.text enum = empnum.text rnum = roomnum.text unameokflag = False passwokflag = False nameokflag = False enumokflag = False rnumokflag = False info.text = '' if uname == '': info.text += '[color=#FF0000]username required[/color]' else: unameokflag = True if passw == '': info.text += '\n[color=#FF0000]password required[/color]' if rpassw == '': info.text += '\n[color=#FF0000]repeat password required[/color]' if name == '': info.text += '\n[color=#FF0000]full name required[/color]' else: nameokflag = True if enum == '': info.text += '\n[color=#FF0000]mobile required[/color]' elif len(enum) != 10 or enum.isdigit() != True: info.text += '\n[color=#FF0000]not a valid mobile number[/color]' else: enumokflag = True if rnum == '': info.text += '\n[color=#FF0000]room number required[/color]' else: rnumokflag = True if passw != '' and rpassw != '' and passw != rpassw: info.text += '\n[color=#FF0000]passwords do not match[/color]' else: passwokflag = True if unameokflag and passwokflag and nameokflag and enumokflag and rnumokflag: self.db2.add_row(uname, passw, enum, 
name, rnum) info.text += '\n[color=#FF0000]resident details successfully inserted[/color]' class FoodiesApp(App): def build(self): return ScreenManagement() if __name__=="__main__": sa = FoodiesApp() sa.run() Kindly help me solve the issue.
Let me answer my own question. The problem is resolved. The error "Error type 3 Error: Activity class {org.test.kkfoodies/org.kivy.android.PythonActivity} does not exist." caused a lot of confusion. It appeared because the app was deployed on an Android phone that did not satisfy the minimum API requirement (the android.minapi = 21 option in the buildozer.spec file). Changing the option to a value below 21 did not help either, because the buildozer version in use did not support deployment on older APIs. The app was then tested on an Android phone that met the minimum API requirement, but the Python library MySQLdb caused problems. I tried adding mysqldb to the requirements attribute and also tried the recipe named mysqldb, but both attempts failed. Finally, I modified the source file main.py to use the mysql.connector library in place of MySQLdb, and added mysql_connector to the requirements attribute of the buildozer.spec file. That did the trick. The app now works smoothly on Android phones (Android Lollipop and later). Thank you all for the help.
Spark Dataset API giving a different result compared to DataFrame
I am using Spark 2.1 and have a Hive table in ORC format. This is the schema:
col_name data_type
tuid string
puid string
ts string
dt string
source string
peer string
# Partition Information
# col_name data_type
dt string
source string
peer string
# Detailed Table Information
Database: test
Owner: test
Create Time: Tue Nov 22 15:25:53 GMT 2016
Last Access Time: Thu Jan 01 00:00:00 GMT 1970
Location: hdfs://apps/hive/warehouse/nis.db/dmp_puid_tuid
Table Type: MANAGED
Table Parameters:
transient_lastDdlTime 1479828353
SORTBUCKETCOLSPREFIX TRUE
# Storage Information
SerDe Library: org.apache.hadoop.hive.ql.io.orc.OrcSerde
InputFormat: org.apache.hadoop.hive.ql.io.orc.OrcInputFormat
OutputFormat: org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat
Compressed: No
Storage Desc Parameters:
serialization.format 1
When I apply a filter on this table using the partition columns, it works fine and reads only the specific partitions:
val puid = spark.read.table("nis.dmp_puid_tuid")
  .as(Encoders.bean(classOf[DmpPuidTuid]))
  .filter( """peer = "AggregateKnowledge" and dt = "20170403"""")
This is the physical plan for that query:
== Physical Plan ==
HiveTableScan [tuid#1025, puid#1026, ts#1027, dt#1022, source#1023, peer#1024], MetastoreRelation nis, dmp_puid_tuid, [isnotnull(peer#1024), isnotnull(dt#1022), (peer#1024 = AggregateKnowledge), (dt#1022 = 20170403)]
But when I use the code below, it reads the entire data set into Spark:
val puid = spark.read.table("nis.dmp_puid_tuid")
  .as(Encoders.bean(classOf[DmpPuidTuid]))
  .filter( tp => tp.getPeer().equals("AggregateKnowledge") && Integer.valueOf(tp.getDt()) >= 20170403)
Physical plan for the code above:
== Physical Plan ==
*Filter <function1>.apply
+- HiveTableScan [tuid#1058, puid#1059, ts#1060, dt#1055, source#1056, peer#1057], MetastoreRelation nis, dmp_puid_tuid
Note: DmpPuidTuid is a Java bean class.
When you pass a Scala function to filter, you prevent the Spark optimizer from seeing which columns of the dataset are actually used (because the optimizer does not try to look inside the compiled code of the function). If you pass a column expression, such as
col("peer") === "AggregateKnowledge" && col("dt").cast(IntegerType) >= 20170403
then the optimizer will be able to see which columns are actually required and adjust the plan accordingly.
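The difference between the two kinds of filter can be pictured with a small toy model. The sketch below is plain Java and does not use Spark at all: the class FilterVisibilitySketch, the nested Equals type, and the referencedColumns helper are hypothetical names made up for illustration. It only shows that a declarative expression exposes the columns it touches, whereas a compiled lambda can only be invoked.

```java
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;
import java.util.function.Predicate;

// Toy model only: these types are hypothetical and are not Spark classes.
class FilterVisibilitySketch {

    // A declarative comparison "column = literal" whose structure can be inspected.
    static final class Equals {
        final String column;
        final String literal;

        Equals(String column, String literal) {
            this.column = column;
            this.literal = literal;
        }

        boolean test(Map<String, String> row) {
            return literal.equals(row.get(column));
        }
    }

    // A planner can read the column names out of declarative expressions.
    static Set<String> referencedColumns(Equals... exprs) {
        Set<String> cols = new TreeSet<>();
        for (Equals e : exprs) {
            cols.add(e.column);
        }
        return cols;
    }

    public static void main(String[] args) {
        Map<String, String> row = Map.of("peer", "AggregateKnowledge", "dt", "20170403");

        Equals byPeer = new Equals("peer", "AggregateKnowledge");
        Equals byDt = new Equals("dt", "20170403");
        System.out.println(referencedColumns(byPeer, byDt)); // [dt, peer]

        // A lambda is a black box: it can be called row by row, but nothing
        // about the columns it reads is visible from the outside.
        Predicate<Map<String, String>> opaque =
                r -> "AggregateKnowledge".equals(r.get("peer"));
        System.out.println(opaque.test(row)); // true
    }
}
```

In this picture, knowing that only dt and peer are referenced is what would let a planner restrict the scan to matching partitions; with the opaque predicate, every row has to be read first.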
WEKA: java.lang.IllegalArgumentException: Unknown data type: int4
I'm trying to connect my PostgreSQL database with WEKA in Eclipse. When I run the main method I get this exception:
Exception in thread "main" java.lang.IllegalArgumentException: Unknown data type: int4. Add entry in weka/experiment/DatabaseUtils.props. If the type contains blanks, either escape them with a backslash or use underscores instead of blanks.
This is what I did:
I downloaded WEKA 3-6-14
I unzipped the weka.jar
I changed the lines of the jdbc url and the jdbc driver in the file weka/experiment/DatabaseUtils.props.postgresql
I added the data types that I use in my database
I renamed the DatabaseUtils.props.postgresql file to DatabaseUtils.props
This is a segment of my DatabaseUtils.props.postgresql file:
# Database settings for PostgreSQL 7.4
#
# General information on database access can be found here:
# http://weka.wikispaces.com/Databases
#
# url: http://www.postgresql.org/
# jdbc: http://jdbc.postgresql.org/
# author: Fracpete (fracpete at waikato dot ac dot nz)
# version: $Revision: 11887 $
# JDBC driver (comma-separated list)
jdbcDriver=org.postgresql.Driver
# database URL
jdbcURL=jdbc:postgresql://localhost:5432/datamining
# specific data types
string, getString() = 0; --> nominal
boolean, getBoolean() = 1; --> nominal
double, getDouble() = 2; --> numeric
# byte, getByte() = 3; --> numeric
# short, getByte()= 4; --> numeric
int, getInteger() = 5; --> numeric
# long, getLong() = 6; --> numeric
# float, getFloat() = 7; --> numeric
# date, getDate() = 8; --> date
# text, getString() = 9; --> string
# time, getTime() = 10; --> date
# timestamp, getTime() = 11; --> date
# PostgreSQL data types to Java classes information can be found at:
# http://www.postgresql.org/message-id/AANLkTinsk4rwT7v-751bwQkgTN1rkA=8uE-jk69nape-#mail.gmail.com
text=0
boolean=1
double=2
int4=5
This is the code in my main method:
InstanceQuery query = new InstanceQuery();
query.setUsername("username");
query.setPassword("password");
query.setQuery("SELECT * FROM database ORDER BY \"id\"");
Instances data = query.retrieveInstances();
Can someone please tell me why I'm still getting the exception?
Try adding a getter entry for int4 in DatabaseUtils.props.postgresql.
I changed my DatabaseUtils.props code to:
# Database settings for PostgreSQL 7.4
#
# General information on database access can be found here:
# http://weka.wikispaces.com/Databases
#
# url: http://www.postgresql.org/
# jdbc: http://jdbc.postgresql.org/
# author: Fracpete (fracpete at waikato dot ac dot nz)
# version: $Revision: 11885 $
# JDBC driver (comma-separated list)
jdbcDriver=org.postgresql.Driver
# database URL
jdbcURL=jdbc:postgresql://localhost:5432/datamining
# specific data types
# string, getString() = 0; --> nominal
# boolean, getBoolean() = 1; --> nominal
# double, getDouble() = 2; --> numeric
# byte, getByte() = 3; --> numeric
# short, getByte()= 4; --> numeric
# int, getInteger() = 5; --> numeric
# long, getLong() = 6; --> numeric
# float, getFloat() = 7; --> numeric
# date, getDate() = 8; --> date
# text, getString() = 9; --> string
# time, getTime() = 10; --> date
# timestamp, getTime() = 11; --> date
# PostgreSQL data types to Java classes information can be found at:
# http://www.postgresql.org/message-id/AANLkTinsk4rwT7v-751bwQkgTN1rkA=8uE-jk69nape-#mail.gmail.com
varchar=0
text=0
float4=7
float8=2
int4=5
oid=5
timestamp=8
date=8
bool=1
int2=5
int8=2
numeric=2
bpchar=9
Now it is working.
You should also check whether the Java version of your Eclipse matches the one required by the driver.jar. Otherwise you will get this exception:
java.lang.NoClassDefFoundError: java/time/temporal/TemporalField
find the last match with java/regex
I have a dynamic text that contains "font-family", for example:
style="font-family: "Calibri","sans-serif"; font-size:11pt";
I want to remove the whole font-family element. I was using this code:
patron = Pattern.compile("font-family:(.*?);");
encaja = patron.matcher(cadena);
nueva = encaja.replaceAll("");
But it removes the text in a way that isn't useful for me:
style="Calibri","sans-serif"; font-size: 11pt;"
What I want is:
style=" font-size: 11pt;"
I also tried using this pattern:
font-family:[^(&.*;)]*?;
But it doesn't work. Can you help me? Thanks
EDIT
More case examples:
in: style="font-size:15px; font-family:Arial; mso-ascii-theme-font: minor-latin; "
output: style="font-size:15px; mso-ascii-theme-font: minor-latin;"
in: style="font-family:Arial,Aás;; font-size:11pt; mso-fareast-mso-fareast-theme-font: minor-latin;"
output: style="font-size:11pt; mso-fareast-mso-fareast-theme-font: minor-latin;"
You can use this:
String result = yourstr.replaceAll("(?i)font-family:(?>[^;&\"]++|&(?>quot|ntilde);)*(?>;\\s*+|(?=\"))", "");
Pattern description:
(?i)              # make the pattern case-insensitive
font-family:
(?>               # open an atomic group
    [^;&\"]++     # all characters except ; & and " one or more times (possessive)
  |               # OR
    &             # literal &
    (?>           # put the different possibilities here
        quot | ntilde
    )
    ;             # literal ;
)*                # repeat the atomic group zero or more times
(?>
    ;\\s*+        # literal ; and trailing spaces
  |
    (?=\")        # followed by " (last value of the attribute without trailing ;)
)
Another, less safe way (IMO) is to skip all letters that are between a & and a ;:
String result = yourstr.replaceAll("(?i)font-family:(?>[^;&\"]++|&[a-z]++;)*(?>;\\s*+|(?=\"))", "");
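To see what the first pattern from this answer does on concrete inputs, here is a small self-contained sketch. The class name FontFamilyStripper and the method strip are made up for illustration, and the second sample assumes the quotes in the original text arrive as &quot; entities, which is what the pattern is written to handle.

```java
import java.util.regex.Pattern;

// Hypothetical wrapper around the pattern quoted in the answer above.
class FontFamilyStripper {

    // Compiled once; the regex itself is copied unchanged from the answer.
    private static final Pattern FONT_FAMILY = Pattern.compile(
            "(?i)font-family:(?>[^;&\"]++|&(?>quot|ntilde);)*(?>;\\s*+|(?=\"))");

    // Removes every font-family declaration from the given attribute text.
    static String strip(String style) {
        return FONT_FAMILY.matcher(style).replaceAll("");
    }

    public static void main(String[] args) {
        // Declaration in the middle of the attribute.
        System.out.println(strip(
                "style=\"font-size:15px; font-family:Arial; mso-ascii-theme-font: minor-latin; \""));
        // prints: style="font-size:15px; mso-ascii-theme-font: minor-latin; "

        // Values quoted with &quot; entities (assumed input form).
        System.out.println(strip(
                "style=\"font-family: &quot;Calibri&quot;,&quot;sans-serif&quot;; font-size:11pt\""));
        // prints: style="font-size:11pt"

        // Declaration is the last value and has no trailing semicolon.
        System.out.println(strip("style=\"font-size:11pt; font-family:Arial\""));
        // prints: style="font-size:11pt; "
    }
}
```

Note that the trailing whitespace after the removed declaration is consumed only when a semicolon precedes it, so the first and third samples keep the space before the closing quote.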
Try this: newstr = str.replaceFirst("font-family:\\s?([^\\s]+)", "");
How to configure the run-on-your-server Spotlight to give the same output as the Spotlight demo on the web?
I am using the run-on-your-server version of Spotlight, but I couldn't configure the properties file so that it produces the same output as the demo. A screenshot of the demo shows the parameters it uses. My configuration looks like this (from the server.properties file on my local machine):
org.dbpedia.spotlight.web.rest.uri = http://localhost:2222/rest
org.dbpedia.spotlight.index.dir = data/index
org.dbpedia.spotlight.spot.dictionary = data/spotter.dict
jcs.default.cacheattributes.MaxObjects = 5000
org.dbpedia.spotlight.tagging.hmm = data/pos-en-general-brown.HiddenMarkovModel
org.dbpedia.spotlight.sparql.endpoint = http://dbpedia.org/sparql
org.dbpedia.spotlight.sparql.graph = http://dbpedia.org
# Configurations for the CoOccurrenceBasedSelector
# From: http://spotlight.dbpedia.org/download/release-0.5/spot_selector.tgz
# org.dbpedia.spotlight.spot.cooccurrence.datasource = ukwac
# org.dbpedia.spotlight.spot.cooccurrence.database.jdbcdriver = org.hsqldb.jdbcDriver
# org.dbpedia.spotlight.spot.cooccurrence.database.connector = jdbc:hsqldb:file:data/spotsel/ukwac_candidate;shutdown=true&readonly=true
# org.dbpedia.spotlight.spot.cooccurrence.database.user = sa
# org.dbpedia.spotlight.spot.cooccurrence.database.password =
# org.dbpedia.spotlight.spot.cooccurrence.classifier.unigram = data/spotsel/ukwac_unigram.model
# org.dbpedia.spotlight.spot.cooccurrence.classifier.ngram = data/spotsel/ukwac_ngram.model
# Other possible values: AtLeastOneNounSelector,CoOccurrenceBasedSelector,NESpotter
org.dbpedia.spotlight.spot.spotters = LingPipeSpotter
# org.dbpedia.spotlight.spot.opennlp.dir = opennlp
# Info for context searcher
org.dbpedia.spotlight.language = English
org.dbpedia.spotlight.lucene.analyzer = SnowballAnalyzer
# Choose between jdbc or lucene for DBpedia Resource creation. Also, if the jdbc throws an error, lucene will be used.
# org.dbpedia.spotlight.core.database = jdbc
# org.dbpedia.spotlight.core.database.jdbcdriver = org.hsqldb.jdbcDriver
# org.dbpedia.spotlight.core.database.connector = jdbc:hsqldb:file:data/database/spotlight-db;shutdown=true&readonly=true
# org.dbpedia.spotlight.core.database.user = sa
# org.dbpedia.spotlight.core.database.password =
# List of disambiguators to load: Document,Occurrences,CuttingEdge,Default
org.dbpedia.spotlight.disambiguate.disambiguators = Default,Document
# From http://spotlight.dbpedia.org/download/release-0.5/candidate-index-full.tgz
# org.dbpedia.spotlight.candidateMap.dir = /fastdata/spotlight/3.7/candidateIndexTitRedDis
The quickstart comes with tiny versions of the index and spotter.dict. If you want the same results as our demo web server, you need to download the larger files, which are several gigabytes in size. You can either overwrite your index and spotter.dict, or change the config to point to the new files. See http://github.com/dbpedia-spotlight/dbpedia-spotlight/wiki/Downloads