
Author Topic: Extend .info file  (Read 36528 times)

Offline roberts

  • Retired Admins
  • Hero Member
  • *****
  • Posts: 7361
  • Founder Emeritus
Re: Extend .info file
« Reply #60 on: February 12, 2012, 02:51:44 PM »
I thought it was headed in a good direction with bmarkus' suggestion of tags and with Rich's suggestion of a wiki page with recommendations of tag sequences. We should not go overboard with too many words; a single line in the info file should suffice.
10+ Years Contributing to Linux Open Source Projects.

Offline vinnie

  • Hero Member
  • *****
  • Posts: 1187
  • HandMace informatic works
Re: Extend .info file
« Reply #61 on: February 12, 2012, 04:25:28 PM »
... what I meant is that appbrowser keyword search will be revisited, if and when there is actual data to be processed, i.e., the "new field" exists and is populated in the repo.

Fine by me, just tell me the name and the place in .info where to put that line and I begin to use it.

However, it may take some time before enough data is present to be useful, and once we decide to take the big step, many applications will undoubtedly remain excluded.

Is there some reason that escapes me why this solution is not feasible?

Quote
Link the "keyword" search to both a new field in .info and the first line of the description, for as long as it takes until the info files are all adequate.
« Last Edit: February 12, 2012, 04:27:21 PM by vinnie »

Offline bmarkus

  • Administrator
  • Hero Member
  • *****
  • Posts: 7183
    • My Community Forum
Re: Extend .info file
« Reply #62 on: February 13, 2012, 02:27:20 AM »
My original proposal was discussed widely and deeply; it is time to stop talking and implement it. Here is the action plan.

1) Keep the Description: field to one line and use it for a short verbal description of the package, if possible using text from upstream. Do not mix it with keywords.

2) Add a single-line Tags: field as below. While it is a free-content line, for consistency and usability use tags according to the guidelines and proposed tags published in the WIKI.

Quote
Title:          unixcw.tcz
Description:    Morse code library and applications
Version:        3.0.1
Author:         Simon Baldwin, Kamal Mostafa, Kamil Ignacak
Original-site:  http://sourceforge.net/projects/unixcw/
Copying-policy: GPL v2
Size:           61k
Extension_by:   bmarkus
Tags:           CLI HAM RADIO CW
Comments:       Binaries only
                ----
                Compiled for TC 4.x
                ----
                PPI compatible
Change-log:     ----
Current:        2012/02/03 First version, 3.0.1

3) Rich will create the WIKI page of tags and guidelines (thanks!)

4) Extension makers start using it in their next regular updates or when submitting new extensions

5) Robert, please add search of the Tags: line to AppBrowser and ScmBrowser in the next RC so it is implemented in final 4.3. To avoid the chicken-and-egg syndrome, please do not postpone it until tags are in use; otherwise no one will add tags if search is not implemented.
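
A minimal sketch of how a search tool might pick the proposed Tags: line out of an .info file. The function name `read_tags` is hypothetical, and the parsing assumes the `Field:   value` layout of the example above; it is not AppBrowser's actual code.

```python
def read_tags(info_text):
    """Return the lower-cased tags from the first 'Tags:' line, or [] if absent."""
    for line in info_text.splitlines():
        # Split on the first colon only, so URLs in other fields are harmless.
        field, sep, value = line.partition(":")
        if sep and field.strip().lower() == "tags":
            return value.lower().split()
    return []


# Shortened copy of the unixcw.tcz example from this post.
sample = """\
Title:          unixcw.tcz
Description:    Morse code library and applications
Version:        3.0.1
Tags:           CLI HAM RADIO CW
Comments:       Binaries only
"""

print(read_tags(sample))
```

Lower-casing the tags is an assumption made here so that searches are case-insensitive; an info file without a Tags: line simply yields an empty list.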

Thanks to everybody contributing :)
Béla
Ham Radio callsign: HA5DI

"Amateur Radio: The First Technology-Based Social Network."

Offline vinnie

  • Hero Member
  • *****
  • Posts: 1187
  • HandMace informatic works
Re: Extend .info file
« Reply #63 on: February 13, 2012, 12:05:50 PM »
This is the keyword occurrence list from (all) description fields, for Rich. It is the best I could do.
« Last Edit: February 13, 2012, 12:09:34 PM by vinnie »

Offline Rich

  • Administrator
  • Hero Member
  • *****
  • Posts: 11725
Re: Extend .info file
« Reply #64 on: February 14, 2012, 04:29:33 PM »
Hi bmarkus
The question that has not yet been answered is whether it is acceptable that the first keyword in the field be used
as a primary index. Using  unixcw.tcz  as an example, I have added  hamradio  to the audio category as well as
the video category should a SSTV application be added. While I have no objection to adding a  hamradio  category,
this seemed like a good way to include it, as I could not find many applications in the repository. So search would
look like this:
Code: [Select]
audio hamradio
This way the search function would only check the remaining tags if the first one is  audio  which would mean
faster searches.  I would also like to hear roberts opinion on this.
Quote
While it is a free content line, to keep consistency and usability use tags according to guidelines and proposed tags published in WIKI.
And that is exactly the point. By starting with a suggested list of standard tags, the search returns relevant results.
Taking the previous example a little further, a ham adding an extension would still be able to add terms like
cw, ssb, morse, etc. to their tags field, since these are specialized terms that a ham would be aware of. Yet a
novice not familiar with those terms could still find out what ham related extensions are available.

I've started cleaning up my list as well as looking at the WIKI. Reading the WIKI syntax page and using the WIKI
playground I'm getting a handle on how to do this. Any suggestions and warnings on editing the WIKI would be
appreciated.

@vinnie: Thanks for the list.
Code: [Select]
tc@box:~$ wc -l Caseinsensitive_occurrences
3957 Caseinsensitive_occurrences
Shouldn't take too long to get through that. :)

Offline roberts

  • Retired Admins
  • Hero Member
  • *****
  • Posts: 7361
  • Founder Emeritus
Re: Extend .info file
« Reply #65 on: February 14, 2012, 04:38:28 PM »
@Rich, I am in agreement with you. It is what I had described as tag sequences.
It will result in faster searches.

We don't need to hold up the release of 4.3 for this. Most of the coding is done on the server side.
As soon as I see even a sampling of data, I will begin to revisit the server side code.

Hopefully this will come together for 4.4 release. But in the meantime we can test sample data and use cases.

Looking forward to the Wiki item followed by some sample items posted in the repository.

Offline Jason W

  • Retired Admins
  • Hero Member
  • *****
  • Posts: 9730
Re: Extend .info file
« Reply #66 on: February 14, 2012, 10:34:32 PM »
I have done and uploaded this for the mozilla web browsers, ntfs-3g, gtk2, and a few others.

Though we would normally wait on each extension maker to edit his info files, we have seen that this is not going to happen quickly.  I personally see no issue with a volunteer or a group of volunteers editing and submitting info files with Tag: fields to jump start the process.  If an extension maker disagrees with a tag in his info file, then he can edit and send in what he likes.

Offline vinnie

  • Hero Member
  • *****
  • Posts: 1187
  • HandMace informatic works
Re: Extend .info file
« Reply #67 on: February 15, 2012, 02:09:23 AM »
Quote
This way the search function would only check the remaining tags if the first one is  audio  which would mean
faster searches.  I would also like to hear roberts opinion on this.

And if the keyword that comes to mind is not the first one, does the search continue or stop?
I think there may be many ambiguities concerning which keywords are most representative.
For example, Jason has put "windows" as the first word for ntfs-3g and "web" for firefox; I would have put "file" and "browser", but it is not clear that either choice is more reasonable.

Quote
Shouldn't take too long to get through that. :)
Not so long, because it is one word per line. However, with a search function (or grep) you can use it as a reference when in doubt about which word is used more :)
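
As a rough illustration of how such an occurrence list could be produced and consulted, here is a sketch that counts case-insensitive word occurrences across description lines. The function name `keyword_counts` is hypothetical, and the sample descriptions other than the unixcw one are made up for the example; this is not the script used to build the attached list.

```python
from collections import Counter
import re


def keyword_counts(descriptions):
    """Count case-insensitive word occurrences across description strings."""
    counts = Counter()
    for text in descriptions:
        # Treat any run of letters or digits as one word.
        counts.update(re.findall(r"[a-z0-9]+", text.lower()))
    return counts


descriptions = [
    "Morse code library and applications",  # from the unixcw.tcz example
    "Web browser",                          # made-up sample
    "Lightweight web browser",              # made-up sample
]

counts = keyword_counts(descriptions)
print(counts["web"], counts["browser"], counts["morse"])
```

Looking up `counts[word]` for two candidate tags shows which one is already more common in the repository descriptions.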


Quote
I personally see no issue with a volunteer or a group of volunteers editing and submitting info files with Tag: fields to jump start the process.  If an extension maker disagrees with a tag in his info file, then he can edit and send in what he likes

+1, and a consistent group of people would also produce more uniform results

Offline Rich

  • Administrator
  • Hero Member
  • *****
  • Posts: 11725
Re: Extend .info file
« Reply #68 on: February 15, 2012, 05:21:59 PM »
Hi everyone,
I've written a small WIKI page that shows a suggested way of setting up and searching for tags. After looking at it,
if anyone feels:
1. A category needs to be added
2. A function needs to be added to a category
3. A function needs to be removed from a category
4. An extension cannot be categorized with the list of categories and functions
Please speak up, make suggestions, or give examples. Keep in mind, the table presented is intended to support
searches for a specific type of application, not every feature it may contain.
I've included a handful of examples at the end of the page.

     http://wiki.tinycorelinux.net/wiki:finding_applications

Offline Rich

  • Administrator
  • Hero Member
  • *****
  • Posts: 11725
Re: Extend .info file
« Reply #69 on: February 16, 2012, 01:59:40 AM »
Hi vinnie
Quote
And if the keyword that comes to mind is not the first one, does the search continue or stop?
If the first keyword does not match, the search moves on to the next extension until it finds one where the first
keyword matches. It then compares the remaining keywords in your query with those in the application's tag field.
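
The behaviour described above can be sketched as follows. This is only an illustration under assumptions: the function name `search`, the in-memory `repo` dictionary, and the entries other than unixcw.tcz are hypothetical, and the real search runs on the server side.

```python
def search(query, repo):
    """Return names of extensions matching every query word.

    The first query word must equal the extension's first tag (the category);
    the remaining query words are only checked when that index tag matches.
    """
    words = query.lower().split()
    if not words:
        return []
    category, rest = words[0], words[1:]
    hits = []
    for name, tags in repo.items():
        if not tags or tags[0] != category:
            continue  # index tag differs: skip without checking other tags
        if all(word in tags[1:] for word in rest):
            hits.append(name)
    return hits


repo = {
    "unixcw.tcz": ["audio", "hamradio", "cw", "morse"],
    "player.tcz": ["audio", "player"],    # made-up entry
    "browser.tcz": ["web", "browser"],    # made-up entry
}

print(search("audio hamradio", repo))
print(search("hamradio", repo))
```

The second query finds nothing because `hamradio` is not a first tag, which is the trade-off being discussed: the index makes the search cheaper, but the person searching has to start with a category.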
Quote
I think there may be many ambiguities concerning which keywords are most representative.
I believe this is where you and I differ in opinion. The point of the WIKI page is to try to establish a  BASE  set of
keywords capable of describing an application's purpose. Just being able to search for "web browser" or
"audio player" and receiving only relevant applications is a big step in the right direction. The list is intentionally
kept short and as generic as possible to minimize being ambiguous. While using the most common keywords
and simply searching the info files may seem like a good idea on the surface, it will miss extensions not containing
any keywords that are relevant. Not to mention if for example you want a list of web browsers, you will also get back
every extension that mentions web browser including cups1311.tcz.

Quote
For example, Jason has put "windows" as the first word for ntfs-3g and "web" for firefox; I would have put "file" and "browser", but it is not clear that either choice is more reasonable.
I guess I should have gotten that WIKI page up faster.


Offline vinnie

  • Hero Member
  • *****
  • Posts: 1187
  • HandMace informatic works
Re: Extend .info file
« Reply #70 on: February 17, 2012, 09:35:14 AM »
I am now writing the info file for one package:
- http://unpaper.berlios.de/
I looked through the tags on the wiki in search of a utility tag and found the category "fileutility".
What happens if I search only for the keyword "utility"?
And if I search for "scanner" "image" "utility" (in this sequence)?

I'll clarify the question:
Suppose a repository contains 3 tags in total ("A" "BC" "DEF").
If I search for keyword A, I should be shown only packages that contain A.
If I search for E and A, I should find only those packages that contain A + DEF (and not only A or only DEF).

If the search works this way (a logical "and" that also matches partial words), that is fine; otherwise I am in doubt about which keywords to use (for now I am adding them without much thought and will replace them later).

Offline Rich

  • Administrator
  • Hero Member
  • *****
  • Posts: 11725
Re: Extend .info file
« Reply #71 on: February 17, 2012, 12:04:39 PM »
Hi vinnie
First of all, thank you for providing a solid working example. That was the purpose of reply #68 and I hope a few
more people will participate.
While the  unpaper  program basically does  "post-processing"  of data, most people won't think of it in those terms.
How about if the tag  enhancement  were added to the printer functions? Then your tag field could read:
printer scanner enhancement  or  printer enhancement scanner  as most people would probably associate this
function with a scanner anyway. While a search on  printer enhancement  or  printer scanner  would still return
your extension, searching on all three tags would narrow the list further.
As far as partial matches are concerned, my preference would be to match from the beginning of the tag so that
the person doing the search could type enough characters to uniquely identify it, though the complete tags should
still be used in the info file. I don't like the idea of word inside of word matches as that may result in undesirable
matches.
Yes, searches should definitely be a logical  AND.
Come to think of it, maybe  data  is the wrong place for the  ocr  tag, printer would make more sense.
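
A small sketch of the matching rule preferred in this reply: every query word must match (logical AND), and a partial word only matches from the beginning of a tag. The function name `matches_prefix` is hypothetical, and the sketch leaves out the first-tag index to keep the prefix rule in focus.

```python
def matches_prefix(query, tags):
    """True if every query word is a prefix of at least one tag (logical AND)."""
    return all(
        any(tag.startswith(word) for tag in tags)
        for word in query.lower().split()
    )


# Tag field suggested above for the unpaper extension.
tags = ["printer", "scanner", "enhancement"]

print(matches_prefix("printer enh", tags))       # prefix of "enhancement"
print(matches_prefix("printer scan enh", tags))  # all three tags, abbreviated
print(matches_prefix("printer hance", tags))     # "hance" is only inside a tag
```

The last query fails because `hance` appears inside `enhancement` but not at its start, which is the word-inside-word match this rule is meant to exclude.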

 

Offline vinnie

  • Hero Member
  • *****
  • Posts: 1187
  • HandMace informatic works
Re: Extend .info file
« Reply #72 on: February 18, 2012, 02:49:23 AM »
I mostly agree with you; the only thing I disagree with is your thinking on partial-word search.
Let me continue with an example.

Hypothetical tags of a package: fileutility archive packer compress
Hypothetical search: "archive" "utility"
Result: the first word is present, so the package would be listed; the second word is not present as a whole tag, so if the search does not include partial words the package is not listed!
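
The example above can be checked with a short sketch that runs the same query with and without partial-word matching. The function name `match_all` and its `partial` flag are hypothetical and only illustrate the two behaviours being compared.

```python
def match_all(query_words, tags, partial):
    """Logical AND over query words; 'partial' allows word-inside-word matches."""
    if partial:
        return all(any(word in tag for tag in tags) for word in query_words)
    return all(word in tags for word in query_words)


tags = ["fileutility", "archive", "packer", "compress"]
query = ["archive", "utility"]

print(match_all(query, tags, partial=False))  # whole tags only
print(match_all(query, tags, partial=True))   # substring matches allowed
```

With whole-tag matching the package is not listed, and with substring matching it is. Matching only from the beginning of a tag would also miss it, since `utility` is not at the start of `fileutility`.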

In essence, I believe that including partial-word search degrades the results a little, but not including it would do much more damage.
I think sufficient optimization has already been achieved by restricting keyword search to the Tags field.

When doing a search, a person should be able to follow their intuition without having to know what limits apply, because in most cases people will not have read up on how to search.

In Italian we say "meglio abbondare che deficere", which translates into English roughly as "better too much than not enough"  ;)

Offline Rich

  • Administrator
  • Hero Member
  • *****
  • Posts: 11725
Re: Extend .info file
« Reply #73 on: February 18, 2012, 03:13:24 AM »
Hi vinnie
While I feel my reasoning on partial matches is sound, it's only an opinion. I suspect roberts will make the final
decision on that.
I get the feeling you are still not grasping that the first tag is used as an index and must come from the CATEGORIES
column.
Quote
When doing a search, a person should be able to follow their intuition without having to know what limits apply, because in most cases people will not have read up on how to search.
That's why the WIKI page is there, to guide people when searching for an extension. If necessary, a link could be
added to the Tinycore website to make its existence more obvious.

Offline coreplayer2

  • Hero Member
  • *****
  • Posts: 3020
Re: Extend .info file
« Reply #74 on: February 18, 2012, 04:16:48 AM »
Personally I don't see the logic in cutting keywords into categories, or putting emphasis on the first keyword.  Any and all keywords need only carry the same weight to be effective.  I understand the intentions, but why make it more complex than it needs to be?

Just my opinion is all..

A wiki to give examples of keywords is all that is necessary to get things moving in the right direction.  Extension creators are going to use their own keywords anyhow.
« Last Edit: February 18, 2012, 04:22:46 AM by coreplayer2 »