Thursday, September 9, 2010

Idea for navigating tags

Related Articles:
Tag, path and weighted tag

As mentioned in a previous article, both the path and the tag approaches to classifying information have pros and cons.

A problem with tags is that a single tag may be associated with a lot of information that may or may not be related to the specific interest of a particular search. For example, a 'statistics' tag may be associated with education, mathematics, physics, programming, etc. A search on 'statistics' alone becomes useless.

A possible approach is to make the search tree-like (branching): once a tag of interest is identified, the search program creates a list of the data items that carry that tag. The program then collects all the tags from these data items and presents them as choices for the next tag selection, and so on. Implemented right, tags can be treated as a dynamic tree/path structure.
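Here is a minimal sketch of that idea in R. The item names, the tags and the helper functions items_with_tags() and next_tags() are all made up for illustration; a real search program would read its items from wherever they are stored.

```r
# Hypothetical sample data: each item name maps to the tags it carries.
items <- list(
  "Intro stats course"   = c("statistics", "education"),
  "Regression in R"      = c("statistics", "programming", "mathematics"),
  "Thermal physics note" = c("statistics", "physics"),
  "Sorting algorithms"   = c("programming", "mathematics")
)

# Names of the items that carry every tag selected so far.
items_with_tags <- function(items, selected) {
  keep <- vapply(items, function(tags) all(selected %in% tags), logical(1))
  names(items)[keep]
}

# Tags found on the matching items, minus the ones already selected,
# ordered by how many matching items carry them.
next_tags <- function(items, selected) {
  matched <- items[items_with_tags(items, selected)]
  counts <- table(unlist(matched, use.names = FALSE))
  counts <- counts[!(names(counts) %in% selected)]
  sort(counts, decreasing = TRUE)
}

# Walk one branch of the "tree": statistics, then programming.
path <- c("statistics")
print(items_with_tags(items, path))
print(next_tags(items, path))

path <- c(path, "programming")
print(items_with_tags(items, path))
```

Each selection narrows the item list, and the remaining tags act like the sub-folders of a path that is built on the fly.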

Monday, June 21, 2010

Android mothership tablet for Android phone

Here's an idea: make an Android tablet a mothership for an Android phone. The connection between them is Wi-Fi.

The tablet does not need camera, phone and GIS capabilities, but it can take advantage of those on the Android phone. You don't want to hold up the whole tablet to take a picture anyway.

So what do you think?

Wednesday, June 2, 2010

The Chrome OS and future

Imagine a world with Chrome OS: almost everyone will be using programs in the cloud, and personal devices will need very little computing power.

For well-connected humans living on Earth, this will not be a problem. Space travelers, however, will need powerful computers if a connection to Earth can't be established easily.

How safely private data can be kept could be a problem. One can easily encrypt data and save it in the cloud. However, if you want to process that data, you need to get it from the cloud, decrypt it and process it. But if the programs are also running in the cloud, you need to send the data to the program for processing. Then how can you know that no one in the processing cloud will peek at your data? One of the reasons encryption works is that the user can run the encryption program on his own isolated machine, assuming that machine is powerful enough to handle it.

Is it possible to package a cloud program so that it is isolated from inspection or peeking?

Personal devices will become very cheap since they do not need a lot of computing power. At the same time, a decent computing machine will become pricey since the demand is low. To be able to own such a machine, people will need to make money by providing cloud services. Otherwise, you will become a machine-less individual.

Tuesday, March 30, 2010

An introduction to R statistics

Recently, I spent some time learning the R environment. It took me a little while to get it, so I would like to describe the system in my own way and hope it will help those whose brains are wired like mine. The intention is not to cover the details of R, just the basics.

Data in the R environment are objects that can have properties, but these objects do not have methods. So we can say they are more like C structures than full-featured objects. Also, R does not support the Object.Property or Object.method() syntax; instead, the dot (.) is an allowable character in identifiers. Properties are accessed, and operations performed, through functions. With this view in mind, we can better understand the limitations of R and how it is constructed.
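A small sketch of what that looks like in practice. The variable name my.scores and its values are made up for illustration.

```r
# A dot is just another character in an identifier, not a member operator.
my.scores <- c(90, 85, 77)

# Properties are read through functions instead of Object.Property syntax.
n <- length(my.scores)   # number of components
m <- mode(my.scores)     # "numeric" for this vector

# Properties are also set through functions (the "replacement" form).
names(my.scores) <- c("ann", "bob", "cat")

print(n)
print(m)
print(my.scores)
```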

With this approach to objects, a function can be made to operate on multiple types of objects by checking the type of the object it receives. In R, the properties of the basic types are documented. The intrinsic properties are mode, length and class, and they can only be accessed through special functions: mode(), length() and class(). Other properties/attributes are accessed through attr(). The list of attributes can be viewed with attributes().
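The accessor functions named above can be tried on a small vector. The attribute name "unit" and the function describe() are hypothetical, made up only to show the pattern of a function that branches on the type of its argument.

```r
x <- c(1.5, 2.5, 3.5)

print(mode(x))     # intrinsic property: mode
print(length(x))   # intrinsic property: length
print(class(x))    # class

# Any other attribute is read and set through attr().
attr(x, "unit") <- "cm"          # "unit" is a made-up attribute name
print(attr(x, "unit"))
print(names(attributes(x)))      # list the attributes that are set

# A function that handles several kinds of object by checking the type.
describe <- function(obj) {
  if (is.factor(obj)) {
    paste("factor with", nlevels(obj), "levels")
  } else if (is.numeric(obj)) {
    paste("numeric vector of length", length(obj))
  } else {
    paste("object of class", class(obj)[1])
  }
}

print(describe(x))
print(describe(factor(c("a", "b", "a"))))
```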

R supports vector operation syntax. For example, A*B multiplies two vectors component by component (the matrix product is written A %*% B). This makes R an ideal tool for expressing matrix operations and for carrying out operations on tabulated data, like those in linear algebra and statistical surveys.
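A short sketch of vector operations, using made-up numbers. Note that * works component by component, while %*% is the matrix product.

```r
A <- c(1, 2, 3)
B <- c(4, 5, 6)

elementwise <- A * B      # multiplies matching components
inner <- sum(A * B)       # inner (dot) product built from the same operation

M <- matrix(1:4, nrow = 2)   # filled column by column
product <- M %*% M           # true matrix multiplication

print(elementwise)
print(inner)
print(product)
```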

Basic Data Type: Vector
The simplest R object type is the vector, which is an ordered list of components of the same kind. Components can be numeric, complex, character, logical and others, and any component can be the missing value NA. Vectors have the mode property, where the mode can be numeric, complex, character, logical and others. You use the mode() function to access the mode property. The other property of vectors is the length, or the number of components, and it can be accessed through the length() function. The names property is also supported. The names property is a vector itself and gives each component a name. Components can be referred to by either integer indexes or their names.
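For example, with a made-up vector of ages:

```r
ages <- c(31, 45, 28)
names(ages) <- c("ann", "bob", "cat")   # the names property is itself a vector

print(mode(ages))      # "numeric"
print(length(ages))    # 3

# A component can be referred to by integer index or by name.
print(ages[2])
print(ages["bob"])

# A comparison yields a vector of mode logical.
flags <- ages > 30
print(mode(flags))
```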

Basic Data Type: Factor
A factor is a vector object with the levels property. The levels property is a vector of the unique values of the original vector, which gives those values an order.

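R objects also have a property called class. For a simple vector the class is usually just its mode (an integer vector is an exception: its mode is 'numeric' but its class is 'integer'). The class of a factor is 'factor'. The class property can be accessed via the class() function.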
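A quick sketch of factors and class, using made-up size labels:

```r
sizes <- c("small", "large", "medium", "small", "large")

f <- factor(sizes)
print(levels(f))   # unique values, sorted alphabetically by default
print(class(f))    # "factor"

# The order of the levels can be given explicitly.
f2 <- factor(sizes, levels = c("small", "medium", "large"))
print(as.integer(f2))   # position of each value within the levels

# Class versus mode for plain vectors.
print(class(c(1.5, 2)))   # "numeric", same as the mode
print(class(1:3))         # "integer"
print(mode(1:3))          # "numeric"
```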

Basic Data Type: Array
An array is a vector object with the dim property. dim is a vector of positive integers whose components specify the size of each dimension. A matrix is an array with two dimensions. The mode of an array is the same as the mode of its components. The class of an array is 'array', except that a two-dimensional array reports 'matrix' (recent versions of R report both 'matrix' and 'array'). An array can be created by combining vectors or by setting a vector's dim property.
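Both ways of creating an array can be sketched like this (the numbers are arbitrary):

```r
# Setting dim on a vector turns it into a 2 x 3 matrix, filled by column.
v <- 1:6
dim(v) <- c(2, 3)
print(v)

# Combining vectors column-wise also gives a matrix.
m2 <- cbind(c(1, 2), c(3, 4))
print(dim(m2))

# A three-dimensional array.
a <- array(1:24, dim = c(2, 3, 4))
print(class(a))   # "array"
print(mode(a))    # mode of the components
```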

Basic Data Type: List
By combining objects of different types in an ordered list, we create a list object. A list object can have a names property, a vector of mode character that gives each list component a name. List objects have the class property set to 'list'.
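For example, a made-up record that mixes a character value, a numeric vector and a logical value:

```r
person <- list(name = "Ann", scores = c(90, 85), passed = TRUE)

print(class(person))   # "list"
print(names(person))   # character vector naming each component

# Components keep their own types.
print(mode(person$name))
print(mode(person$scores))
print(mode(person$passed))
```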

Basic Data Type: data.frame
A data.frame object can be considered an extension of the list object with a restriction placed on the size of the list components, so that the data.frame resembles a table-like structure in which each column has the same number of values. The list components can be vectors, factors, matrices or lists. A data.frame has the class 'data.frame'.
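A small sketch with made-up columns. The last lines show the size restriction at work: columns of lengths 2 and 3 cannot be combined.

```r
df <- data.frame(
  name  = c("ann", "bob", "cat"),
  age   = c(31, 45, 28),
  group = factor(c("a", "b", "a"))
)

print(class(df))     # "data.frame"
print(is.list(df))   # a data.frame is still a list underneath
print(nrow(df))
print(ncol(df))
print(df$age)        # one column, accessed like a list component

# Columns of mismatched length are rejected.
bad <- tryCatch(data.frame(a = 1:2, b = 1:3), error = function(e) "error")
print(bad)
```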

Useful Function/Operators
getwd(), setwd(), objects(), ls(), library()

Bgn:End (colon), c(), vector(), factor(), list(), data.frame(), matrix(), cbind(), rbind()

as.vector(), as.factor() ...

[], [[]], $
  • [] returns the same type
  • Vctr[ NdxVctr ], Vctr[ NmsVctr ], Vctr[ LgcVctr ] ...
  • Mtrx[ NdxVctr1, NdxVctr2 ] ...
  • [[ Ndx ]] and $ return the component.
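The indexing forms above can be sketched as follows. All the data is made up for illustration.

```r
v <- c(a = 10, b = 20, c = 30)
print(v[c(1, 3)])       # index vector
print(v[c("a", "b")])   # names vector
print(v[v > 15])        # logical vector

m <- matrix(1:6, nrow = 2)   # 2 x 3, filled by column
print(m[1, c(2, 3)])         # row 1, columns 2 and 3

lst <- list(x = 1:3, y = "text")
print(class(lst["x"]))    # [] returns the same type: still a list
print(lst[["x"]])         # [[ ]] returns the component itself
print(lst$x)              # $ does the same, by name
```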

assign(), <-, ->, grep(), function(), tapply(), is.vector(), summary(), names()
& (element-wise), | (element-wise), && (single value, short-circuit), || (single value, short-circuit)

Control Structure
  • if (Exp1) Exp2 else Exp3
  • ifelse( LgcVctr, TrVctr, FlsVctr) returns a vector with components from TrVctr and FlsVctr based on LgcVctr
  • for ( Ndx in Vctr) { Expr... }
  • while (Cndtn) { Expr ... }
  • break
  • Vrbl <- function (Arg1, Arg2, ...) { Expr ... }
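A minimal sketch that exercises each of the structures listed above. The function name classify and the numbers are made up for illustration.

```r
# function definition with if / else
classify <- function(x) {
  if (x < 0) "negative" else "non-negative"
}
print(classify(-2))

# ifelse works on a whole logical vector at once
signs <- ifelse(c(-1, 2, -3) < 0, "neg", "pos")
print(signs)

# for loop over the components of a vector
total <- 0
for (i in 1:5) {
  total <- total + i
}
print(total)

# while loop, left early with break
n <- 0
while (TRUE) {
  n <- n + 1
  if (n >= 3) break
}
print(n)
```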
The above provides enough information for a basic understanding of R. For details, please visit the R-Intro and the R-Reference.pdf.

Sunday, March 14, 2010

News feeds to China public

According to the news, Google could retreat from China as soon as the end of March 2010. Recently, I read a news article that I feel is really worth reading for all Chinese people, including people living in Taiwan, and I began to think about how I can help feed news to the Chinese living inside China's Great Firewall.

I am sorry to say, "Bill Gates, I do not agree with you that there are a lot of (technical) ways that Chinese people can reach the outside world." I believe they need help. One thing I hope can be done is to defeat the filtering of search results. Here are some of my ideas.

1. Create sites with articles permissible to the Chinese Communist Party so that they can be indexed. We will then provide a CAPTCHA and allow users to view the unfiltered results - possibly from Google.
2. Encourage everyone who cares about this issue to set up various web sites so that it becomes extremely expensive for the Chinese Communist Party to filter all these sites.

Comments welcome. I believe there are a lot of smart people who can provide even better ideas, like dynamically registering new sites from time to time.

Monday, February 15, 2010

Youtube Flash Video download with Firefox

Firefox add-on: Download Flash and Video is a very nice add-on.

Once you have installed the add-on, you will see a downward arrow on the lower-right side of the browser window. If you are on a web page, clicking on the arrow will show a list of the Flash sources on the page. If you mouse over an item, the source on the web page will be highlighted. You can save the .swf file directly, or you can start the video and click the arrow again to download the .flv files.

Once you have the video saved, you will need a player to view the video.

Thursday, February 11, 2010

Census DataFerrett Install on other drive and Vista

On Other Drive
Census' stand-alone desktop DataFerrett can be downloaded from here. The current beta version for Windows (as of Feb. 8, 2010), when installed, automatically goes into C:\Program Files\.

Since my C: hard drive is almost full, I tried to run the program from another drive. What I found is that if you keep just the image and the messages sub-directories on C: and move everything else to the other drive, the program works fine.

When I downloaded the program on Feb. 11, 2010 and installed it on the C: drive of my Vista Home machine, I also ran into problems. Every time I started the program, it prompted me to upgrade to version 1.03.05_B 2009-05-15 from version 1.03.04_B 2006???. I followed the prompt, and the program reported completion of the upgrade and asked me to click OK and restart the program. But when I started the program, it again prompted me to upgrade to the same version.

Finally, I renamed the file version.txt and started the program. The program reported an error, but it did take me to the normal starting window. I ran into some problems when I tried to download data and create a table, but I will have to check more carefully to see whether they were caused by my other actions.

Well. On Vista, you need to right-click DataFerrett.exe and choose 'run as administrator'. This will download the new files into the \new sub-folder. After you click OK to dismiss the dialog, you need to right-click the file and choose 'run as administrator' again. This time, the old version of the files will be copied into the \old sub-folder while the new files are installed. The program should run fine from now on.

If you did copy the directory to another drive, the installation of the new version is a snap. Somehow, Vista is quite protective of the 'program files' directory.

Now I am a happy camper.

* I am an old-fashioned guy and I like to control the programs I am running. I hate programs that automatically update themselves without any warning. If programs are going to upgrade themselves, they should notify the user, back up the old version and let the user decide whether to upgrade or not. At the very least, they need to allow the user to roll back to the old version. I think Microsoft learned this well: updates can have side effects even if you are careful.