
Feed aggregator

Nathan Haines: Announcing the Ubuntu 18.10 Free Culture Showcase winners

Planet Ubuntu - Fri, 28/09/2018 - 9:00am

October approaches, and Ubuntu marches steadily along the road from one LTS to another. Ubuntu 18.10 is another step in Ubuntu’s future. And now it’s time to unveil a small part of that change: the community wallpapers to be included in Ubuntu 18.10!

Every cycle, talented artists around the world create media and release it under licenses that encourage sharing and adaptation. This cycle we had some amazing images submitted to the Ubuntu 18.10 Free Culture Showcase photo pool on Flickr, where all eligible submissions can be found. The competition was fierce; narrowing down the options to the final selections was painful!

But there can be only 12, and the final images that will be included in Ubuntu 18.10 are:

A big congratulations to the winners, and thanks to everyone who submitted a wallpaper. You can find these wallpapers (along with dozens of other stunning wallpapers) today at the links above, or in your desktop wallpaper list after you upgrade to or install Ubuntu 18.10 on October 18th.

Ubuntu Studio: Ubuntu Studio 18.10 (Cosmic Cuttlefish) Beta released

Planet Ubuntu - Fri, 28/09/2018 - 7:09am
The Ubuntu Studio team is pleased to announce the final beta release of Ubuntu Studio 18.10 Cosmic Cuttlefish. While this beta is reasonably free of any showstopper CD build or installer bugs, you may find some bugs within. This image is, however, reasonably representative of what you will find when Ubuntu Studio 18.10 is released […]

Ubuntu MATE: Ubuntu MATE 18.10 Beta

Planet Ubuntu - Fri, 28/09/2018 - 1:30am

Ubuntu MATE 18.10 is a modest, yet strategic, upgrade over our 18.04 release. If you want bug fixes and improved hardware support then 18.10 is for you. For those who prefer staying on the LTS then everything in this 18.10 release is also important for the upcoming 18.04.2 release. Read on to learn more...

We are preparing Ubuntu MATE 18.10 (Cosmic Cuttlefish) for distribution on October 18th, 2018. With this Beta pre-release, you can see what we are trying out in preparation for our next (stable) version.


Superposition on the Intel Core i7-8809G Radeon RX Vega M powered Hades Canyon NUC

What works?

People tell us that Ubuntu MATE is stable. You may, or may not, agree.

Ubuntu MATE Beta Releases are NOT recommended for:

  • Regular users who are not aware of pre-release issues
  • Anyone who needs a stable system
  • Anyone uncomfortable running a possibly frequently broken system
  • Anyone in a production environment with data or workflows that need to be reliable

Ubuntu MATE Beta Releases are recommended for:

  • Regular users who want to help us test by finding, reporting, and/or fixing bugs
  • Ubuntu MATE, MATE, and GTK+ developers
What changed since the Ubuntu MATE 18.04 final release?

Curiously, the work during this Ubuntu MATE 18.10 release has really been focused on what will become Ubuntu MATE 18.04.2. Let me explain.

MATE Desktop

The upstream MATE Desktop team have been working on many bug fixes for MATE Desktop 1.20.x, which has resulted in a lot of maintenance updates in the upstream releases of MATE Desktop. The Debian packaging team for MATE Desktop, of which I am a member, has been updating all the MATE packages to track these upstream bug fixes and new releases. Just about all MATE Desktop packages and associated components, such as AppMenu and MATE Dock Applet, have been updated. Now that all these fixes exist in the 18.10 release, we will start the process of SRU'ing (backporting) them to 18.04 so that they will feature in the Ubuntu MATE 18.04.2 release due in February 2019. The fixes should start landing in Ubuntu MATE 18.04 very soon, well before the February deadline.

Hardware Enablement

Ubuntu MATE 18.04.2 will include a hardware enablement stack (HWE) based on what is shipped in Ubuntu 18.10. Ubuntu users are increasingly adopting the current generation of AMD RX Vega GPUs, both discrete and integrated solutions such as the Intel Core i7-8809G Radeon RX Vega M found in the Hades Canyon NUC and some laptops. I have been lobbying people within the Ubuntu project to upgrade to newer versions of the Linux kernel, firmware, Mesa and Vulkan that offer the best possible "out of box" support for AMD GPUs. Consequently, Ubuntu 18.10 (of any flavour) is great for owners of AMD graphics solutions and these improvements will soon be available in Ubuntu 18.04.2 too.

Download Ubuntu MATE 18.10 Beta

We've redesigned the download page so it's even easier to get started.

Known Issues

Here are the known issues.

Ubuntu MATE
  • The Software Boutique doesn't list any available software.
    • An update, due very soon, will re-stock the software library and add a few new applications too.
Ubuntu family issues

This is our list of known bugs that affect all flavours.

You'll also want to check the Ubuntu MATE bug tracker to see what has already been reported. These issues will be addressed in due course.

Feedback

Is there anything you can help with or want to be involved in? Maybe you just want to discuss your experiences or ask the maintainers some questions. Please come and talk to us.

Aristotle and D&D alignments

Planet Debian - Thu, 27/09/2018 - 10:12pm

Aristotle’s distinction in EN between brutishness and vice might be comparable to the distinction in Dungeons & Dragons between chaotic evil and lawful evil, respectively.

I’ve always thought that the forces of lawful evil are more deeply threatening than those of chaotic evil. In the Critical Hit podcast, lawful evil is equated with tyranny.

Of course, at least how I run it, Aristotelian ethics involves no notion of evil, only mistakes about the good.

Sean Whitton https://spwhitton.name//blog/ Notes from the Library

Debian Policy call for participation -- September 2018

Planet Debian - Thu, 27/09/2018 - 10:07pm

Here’s a summary of some of the bugs against the Debian Policy Manual that are thought to be easy to resolve.

Please consider getting involved, whether or not you’re an existing contributor.

For more information, see our README.

#152955 force-reload should not start the daemon if it is not running

#172436 BROWSER and sensible-browser standardization

#188731 Also strip .comment and .note sections

#212814 Clarify relationship between long and short description

#273093 document interactions of multiple clashing package diversions

#314808 Web applications should use /usr/share/package, not /usr/share/doc/package

#348336 Clarify Policy around shared configuration files

#425523 Describe error unwind when unpacking a package fails

#491647 debian-policy: X font policy unclear around TTF fonts

#495233 debian-policy: README.source content should be more detailed

#649679 [copyright-format] Clarify what distinguishes files and stand-alone license paragraphs.

#682347 mark ‘editor’ virtual package name as obsolete

#685506 copyright-format: new Files-Excluded field

#685746 debian-policy Consider clarifying the use of recommends

#694883 copyright-format: please clarify the recommended form for public domain files

#696185 [copyright-format] Use short names from SPDX.

#697039 expand cron and init requirement to check binary existence to other scripts

#722535 debian-policy: To document: the “Binary-Only” field in Debian changes files.

#759316 Document the use of /etc/default for cron jobs

#770440 debian-policy: policy should mention systemd timers

#780725 PATH used for building is not specified

#794653 Recommend use of dpkg-maintscript-helper where appropriate

#809637 DEP-5 does not support filenames with blanks

#824495 debian-policy: Source packages “can” declare relationships

#833401 debian-policy: virtual packages: dbus-session-bus, dbus-default-session-bus

#845715 debian-policy: Please document that packages are not allowed to write outside their source directories

#850171 debian-policy: Addition of having an ‘EXAMPLES’ section in manual pages debian policy 12.1

#853779 debian-policy: Clarify requirements about update-rc.d and invoke-rc.d usage in maintainer scripts

#904248 Add netbase to build-essential

Sean Whitton https://spwhitton.name//blog/ Notes from the Library

My Work on Debian LTS (September 2018)

Planet Debian - Thu, 27/09/2018 - 11:40am

In September 2018, I did 10 hours of work on the Debian LTS project as a paid contributor. Thanks to all LTS sponsors for making this possible.

This is my list of work done in September 2018:

  • Upload of polarssl (DLA 1518-1) [1].
  • Work on CVE-2018-16831, discovered in the smarty3 package. Plan (A) was to backport the latest smarty3 release to Debian stretch and jessie, but runtime tests against GOsa² (one of the PHP applications that use smarty3) already failed for Debian stretch, so this plan was dropped. Plan (B) then was to extract a patch [2] fixing this issue for Debian stretch's smarty3 package version from a manifold of upstream code changes, with the eventual realization that smarty3 in Debian jessie is very likely not affected. Upstream feedback is still pending; upload(s) will occur in the coming week (first week of October).

light+love
Mike

References

[1] https://lists.debian.org/debian-lts-announce/2018/09/msg00029.html

[2] https://salsa.debian.org/debian/smarty3/commit/8a1eb21b7c4d971149e76cd2b...

sunweaver http://sunweavers.net/blog/blog/1 sunweaver's blog

A nice oneliner

Planet Debian - Wed, 26/09/2018 - 7:51pm

Pop quiz! Let's say I have a datafile describing some items (images and feature points in this example):

# filename x y
000.jpg 79.932824 35.609049
000.jpg 95.174662 70.876506
001.jpg 19.655072 52.475315
002.jpg 19.515351 33.077847
002.jpg 3.010392 80.198282
003.jpg 84.183099 57.901647
003.jpg 93.237358 75.984036
004.jpg 99.102619 7.260851
005.jpg 24.738357 80.490116
005.jpg 53.424477 27.815635
....
....
149.jpg 92.258132 99.284486

How do I get a random subset of N images, using only the shell and standard commandline tools?

Bam!

$ N=5; ( echo '# filename'; seq 0 149 | shuf | head -n $N | xargs -n1 printf "%03d.jpg\n" | sort) | vnl-join -j filename input.vnl -
# filename x y
017.jpg 41.752204 96.753914
017.jpg 86.232504 3.936258
027.jpg 41.839110 89.148368
027.jpg 82.772742 27.880592
067.jpg 57.790706 46.153623
067.jpg 87.804939 15.853087
076.jpg 41.447477 42.844849
076.jpg 93.399829 64.552090
142.jpg 18.045497 35.381083
142.jpg 83.037867 17.252172

Dima Kogan http://notes.secretsauce.net Dima Kogan

Benjamin Mako Hill: Shannon’s Ghost

Planet Ubuntu - Wed, 26/09/2018 - 4:34am

I’m spending the 2018-2019 academic year as a fellow at the Center for Advanced Study in the Behavioral Sciences (CASBS) at Stanford.

Claude Shannon on a bicycle.

Every CASBS study is labeled with a list of  “ghosts” who previously occupied the study. This year, I’m spending the year in Study 50 where I’m haunted by an incredible cast that includes many people whose scholarship has influenced and inspired me.

The top part of the list of ghosts in Study #50 at CASBS.

Foremost among this group is Study 50’s third occupant: Claude Shannon

At 21 years old, Shannon’s masters thesis (sometimes cited as the most important masters thesis in history) proved that electrical circuits could encode any relationship expressible in Boolean logic and opened the door to digital computing. Incredibly, this is almost never cited as Shannon’s most important contribution. That came in 1948 when he published a paper titled A Mathematical Theory of Communication which effectively created the field of information theory. Less than a decade after its publication, Aleksandr Khinchin (the mathematician behind my favorite mathematical constant) described the paper saying:

Rarely does it happen in mathematics that a new discipline achieves the character of a mature and developed scientific theory in the first investigation devoted to it…So it was with information theory after the work of Shannon.

As someone whose own research is seeking to advance computation and mathematical study of communication, I find it incredibly propitious to be sharing a study with Shannon.

Although I teach in a communication department, I know Shannon from my background in computing. I’ve always found it curious that, despite the fact Shannon’s 1948 paper is almost certainly the most important single thing ever published with the word “communication” in its title, Shannon is rarely taught in communication curricula and is sometimes completely unknown to communication scholars.

In this regard, I’ve thought a lot about this passage in Robert Craig’s influential article “Communication Theory as a Field”, which argued:

In establishing itself under the banner of communication, the discipline staked an academic claim to the entire field of communication theory and research—a very big claim indeed, since communication had already been widely studied and theorized. Peters writes that communication research became “an intellectual Taiwan, claiming to be all of China when, in fact, it was isolated on a small island” (p. 545). Perhaps the most egregious case involved Shannon’s mathematical theory of information (Shannon & Weaver, 1948), which communication scholars touted as evidence of their field’s potential scientific status even though they had nothing whatever to do with creating it, often poorly understood it, and seldom found any real use for it in their research.

In preparation for moving into Study 50, I read a new biography of Shannon by Jimmy Soni and Rob Goodman and was excited to find that Craig—although accurately describing many communication scholars’ lack of familiarity—almost certainly understated the importance of Shannon to communication scholarship.

For example, the book form of Shannon’s 1948 article was published by the University of Illinois at the urging of, and under the editorial supervision of, Wilbur Schramm (one of the founders of modern mass communication scholarship), who was a major proponent of Shannon’s work. Everett Rogers (another giant in communication) devotes a chapter of his “History of Communication Studies”² to Shannon and to tracing his impact in communication. Both Schramm and Rogers built on Shannon in parts of their own work. Shannon has had an enormous impact, it turns out, in several subareas of communication research (e.g., attempts to model communication processes).

Although I find these connections exciting, my own research—like most of the rest of communication—is far from the substance of technical communication processes at the center of Shannon’s own work. In this sense, it can be a challenge to explain to my colleagues in communication—and to my fellow CASBS fellows—why I’m so excited to be sharing a space with Shannon this year.

Upon reflection, I think it boils down to two reasons:

  1. Shannon’s work is both mathematically beautiful and incredibly useful. His seminal 1948 article points to concrete ways that his theory can be useful in communication engineering including in compression, error correcting codes, and cryptography. Shannon’s focus on research that pushes forward the most basic type of basic research while remaining dedicated to developing solutions to real problems is a rare trait that I want to feature in my own scholarship.
  2. Shannon was incredibly playful. Shannon played games, juggled constantly, and was always seeking to teach others to do so. He tinkered, rode unicycles, built a flame-throwing trumpet, and so on. With Marvin Minsky, he invented the “ultimate machine”—a machine whose only function is to turn itself off—which he kept on his desk.

    A version of Shannon’s “ultimate machine” that is sitting on my desk at CASBS.

I have no misapprehension that I will accomplish anything like Shannon’s greatest intellectual achievements during my year at CASBS. I do hope to be inspired by Shannon’s creativity, focus on impact, and playfulness. In my own little ways, I hope to build something at CASBS that will advance mathematical and computational theory in communication in ways that Shannon might have appreciated.

  1. Incredibly, the year that Shannon was in Study 50, his neighbor in Study 51 was Milton Friedman. Two thoughts: (i) Can you imagine?! (ii) I definitely chose the right study!
  2. Rogers’ book was written, I found out, during his own stint at CASBS. Alas, it was not written in Study 50.

Benjamin Mako Hill https://mako.cc/copyrighteous copyrighteous

Stephen Michael Kellat: Work Items To Remember

Planet Ubuntu - Wed, 26/09/2018 - 4:30am

Sometimes I truly cannot remember everything. There have been many, many things going on as of late. Being on medical leave has not been helpful, either.

As we look to the last quarter of 2018, there are some matters I need to remind myself about keeping in the work plan:

  1. Finish the write-up on the research for Outernet/Othernet.

  2. Begin looking at what I need to do to set up a FidoNet node. I haven’t been involved in FidoNet since high school during President Bill Clinton’s second term in office.

  3. Consider the possibility that the folks of DarkNetPlan failed. After looking at this post I honestly need to look at finding a micrographics artist that I can set up a working relationship with. Passing digital data via microfilm sounds old-fashioned but seems more durable these days.

  4. Construct a proper permanent HF antenna for operating. I am a ham radio operator with General class privileges in the United States that remain barely used even though I am only a few years away from joining the Quarter Century Wireless Association.

  5. Figure out what I’m doing wrong setting up multiple HDHomeRun receivers to be tapped by a PVR-styled computer.

  6. Pick up 18 graduate semester hours so I can teach as an adjunct somewhere. This would generally have to happen in a graduate certificate program in the US or at the halfway mark in a master’s degree program.

With my day job being constantly in flux, I am sure I’ve missed something in the listing above.

GSoC 2018: Final Report

Planet Debian - Tue, 25/09/2018 - 7:08pm


This is my final report for my Google Summer of Code 2018; it also serves as my final code submission.

For the last 3 months I have been working with Debian on the project Extracting Data from PDF Invoices and Bills Details. Information about the project can be found here: 

https://wiki.debian.org/SummerOfCode2018/Projects/ExtractingDataFromPDFInvoicesAndBillsDetails.

My mentor and I agreed to modify the work to be done over the summer, as already discussed here: http://blog.harshitjoshi.in/2018/05/gsoc-2018-debian-community-bonding.html
We will advance the ecosystem for machine-readable invoice exchange and make it easily accessible for the whole Python community by making the following contributions:
  • Python library to read/write/add/edit Factur-x metadata in different XML-flavors in Python.
  • Command line interface to process PDF files and access the main library functions.
  • Way to add structured data to existing files or from legacy accounting systems. (via invoice2data project)
  • New desktop GUI to add, edit, import and export factur-x metadata in- and out of PDF files.
Short overview

The project work can be bifurcated into two parts:
  • Main Deliverable: GUI creation for Factur-X Library
  • Pre-requisites for Main Deliverable: Improvements to invoice2data library and updating Factur-X library to a working state
Contributions to invoice2data

invoice2data is a modular Python library to support your accounting process. Tested on Python 2.7 and 3.4+. Its main steps (a short usage sketch follows this list):
  1. extracts text from PDF files using different techniques, like pdftotext, pdfminer or tesseract OCR.
  2. searches for regex in the result using a YAML-based template system
  3. saves results as CSV, JSON or XML or renames PDF files to match the content.
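
As a rough usage sketch of that workflow (the command-line flags follow the project's README of the time and may differ between versions; file and folder names are placeholders):

$ pip install invoice2data

# Extract fields from an invoice using the built-in templates
$ invoice2data my-invoice.pdf

# Use a folder of custom YAML templates and emit JSON instead of CSV
$ invoice2data --template-folder my-templates/ --output-format json my-invoice.pdf
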
My contributions: https://github.com/invoice-x/invoice2data/commits?author=duskybomb
Contributions to Factur-X

Factur-X is an EU standard for embedding XML representations of invoices in PDF files. This library provides an interface for reading, editing and saving this metadata. My contributions: https://github.com/invoice-x/factur-x-ng/commits?author=duskybomb
Organisation Page

An organisation created on GitHub, invoice-x, to tie all the repositories together in a single place.
Link to organisation page: https://github.com/invoice-x/

Organisation Website

A static website briefly explaining the whole project. Link to website: https://www.invoice-x.org/

Main Deliverable Repository

This repository contains the code for the GUI for the Factur-X library. Link to the repository: https://github.com/invoice-x/invoicex-gui

invoicex-gui: invoice2data integration with invoicex-gui and factur-x-ng
Overview

Pre-requisites for Main Deliverable

Factur-X

To work on GUI creation for Factur-X, I first needed to update the Factur-X library to a working state. My mentor, Manuel, did the initial refactoring of the project after forking the original repository, https://github.com/akretion/factur-x.

Since then I have added a few features to the library:
  • Fix checking of embedded resources
  • Converting the documentation format from md to rst
  • Added unit tests for factur-x
  • Added new feature to export metadata in JSON and YAML format
  • Cleaned XML template to add
  • Added validation of country and currency codes with ISO standards.
  • Implemented Command Line Options
Invoice2data

I started contributing to invoice2data in the month of February. Invoice2data became the first open source project I contributed to. The first contribution was just fixing a typo in the documentation, but this introduced me to the world of Free Open Source Software (FOSS).

Since being selected for Google Summer of Code 2018, I have added the following commits:
  • Removed required fields in favour of providing flexibility to extract data
  • Added feature to extract all fields mentioned in template
  • Updated README and worked on conversion of md to rst
  • Added checks for dependencies: tesseract and imagemagick
  • Changed subprocess input form normal string to list
  • Added more tests and checked coverage locally
  • Fixed the ways invoice2data handles lists
Main Deliverable

Invoicex-GUI

My main deliverable was to make a graphical user interface for the Factur-X library. For this I used the PyQt5 framework. The other options were Kivy and wxWidgets. I have some prior experience with PyQt5, and a bug in Kivy related to the touchpad driver on Debian inclined me towards PyQt5.

Making the GUI was one of the most challenging parts of the GSoC project. The lack of documentation for PyQt5 didn’t help much. I have 3 years of experience with C++ and used it to learn more about PyQt5 through the original Qt documentation, which is in C++.

The GUI includes:
  • Select a PDF and search for any embedded standard
  • If no standard is found, show a pop-up to select the standard to be added
  • Edit metadata of existing embedded standard
  • Export metadata
  • Validate Metadata
  • Use invoice2data to extract field data from invoice
Weekly Work Done

https://lists.debian.org/debian-outreach/2018/05/msg00015.html (week 1)
https://lists.debian.org/debian-outreach/2018/05/msg00029.html (week 2)
https://lists.debian.org/debian-outreach/2018/06/msg00003.html (week 3)
https://lists.debian.org/debian-outreach/2018/06/msg00029.html (week 4)
https://lists.debian.org/debian-outreach/2018/06/msg00078.html (week 5)
https://lists.debian.org/debian-outreach/2018/06/msg00106.html (week 6)
https://lists.debian.org/debian-outreach/2018/06/msg00136.html (week 7)
https://lists.debian.org/debian-outreach/2018/07/msg00019.html (week 8)
https://lists.debian.org/debian-outreach/2018/07/msg00072.html (week 9, 10)
https://lists.debian.org/debian-outreach/2018/07/msg00105.html (week 11)
https://lists.debian.org/debian-outreach/2018/08/msg00011.html (week 12)

Harshit Joshi noreply@blogger.com Harshit Joshi's Blog

Reproducible Builds: Weekly report #178

Planet Debian - Tue, 25/09/2018 - 6:53pm

Here’s what happened in the Reproducible Builds effort between Sunday September 16 and Saturday September 22 2018:

Patches filed

diffoscope development

diffoscope version 102 was uploaded to Debian unstable by Mattia Rizzolo. It included contributions already covered in previous weeks as well as new ones from:

Test framework development

There were a number of updates to our Jenkins-based testing framework that powers tests.reproducible-builds.org this month, including:

Misc.

This week’s edition was written by Bernhard M. Wiedemann, Chris Lamb, Daniel Shahaf, Holger Levsen, Jelle van der Waa, Vagrant Cascadian & reviewed by a bunch of Reproducible Builds folks on IRC & the mailing lists.

Reproducible builds folks https://reproducible-builds.org/blog/ reproducible-builds.org

Crossing the Great St Bernard Pass

Planet Debian - Tue, 25/09/2018 - 4:26pm

It's a great day for the scenic route to Italy, home of Beethoven's Swiss cousins.

What goes up, must come down...

Descent into the Aosta valley

Daniel.Pocock https://danielpocock.com/tags/debian DanielPocock.com - debian

Smallish haul

Planet Debian - Tue, 25/09/2018 - 6:34am

It's been a little while since I've made one of these posts, and of course I'm still picking up this and that. Books won't buy themselves!

Elizabeth Bear & Katherine Addison — The Cobbler's Boy (sff)
P. Djèlí Clark — The Black God's Drums (sff)
Sabine Hossenfelder — Lost in Math (nonfiction)
N.K. Jemisin — The Dreamblood Duology (sff)
Mary Robinette Kowal — The Calculating Stars (sff)
Yoon Ha Lee — Extracurricular Activities (sff)
Seanan McGuire — Night and Silence (sff)
Bruce Schneier — Click Here to Kill Everybody (nonfiction)

I have several more pre-orders that will be coming out in the next couple of months. Still doing lots of reading, but behind on writing up reviews, since work has been busy and therefore weekends have been low-energy. That should hopefully change shortly.

Russ Allbery https://www.eyrie.org/~eagle/ Eagle's Path

Archiving web sites

Planet Debian - Tue, 25/09/2018 - 2:00am

I recently took a deep dive into web site archival for friends who were worried about losing control over the hosting of their work online in the face of poor system administration or hostile removal. This makes web site archival an essential instrument in the toolbox of any system administrator. As it turns out, some sites are much harder to archive than others. This article goes through the process of archiving traditional web sites and shows how it falls short when confronted with the latest fashions in the single-page applications that are bloating the modern web.

Converting simple sites

The days of handcrafted HTML web sites are long gone. Now web sites are dynamic and built on the fly using the latest JavaScript, PHP, or Python framework. As a result, the sites are more fragile: a database crash, spurious upgrade, or unpatched vulnerability might lose data. In my previous life as web developer, I had to come to terms with the idea that customers expect web sites to basically work forever. This expectation matches poorly with "move fast and break things" attitude of web development. Working with the Drupal content-management system (CMS) was particularly challenging in that regard as major upgrades deliberately break compatibility with third-party modules, which implies a costly upgrade process that clients could seldom afford. The solution was to archive those sites: take a living, dynamic web site and turn it into plain HTML files that any web server can serve forever. This process is useful for your own dynamic sites but also for third-party sites that are outside of your control and you might want to safeguard.

For simple or static sites, the venerable Wget program works well. The incantation to mirror a full web site, however, is byzantine:

$ nice wget --mirror --execute robots=off --no-verbose --convert-links \
    --backup-converted --page-requisites --adjust-extension \
    --base=./ --directory-prefix=./ --span-hosts \
    --domains=www.example.com,example.com http://www.example.com/

The above downloads the content of the web page, but also crawls everything within the specified domains. Before you run this against your favorite site, consider the impact such a crawl might have on the site. The above command line deliberately ignores robots.txt rules, as is now common practice for archivists, and hammers the website as fast as it can. Most crawlers have options to pause between hits and limit bandwidth usage to avoid overwhelming the target site.
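
With Wget, for instance, such throttling can be added to a mirroring run along these lines (the values are only an illustration):

# Pause about a second between requests and cap the download rate
$ nice wget --mirror --page-requisites --convert-links --adjust-extension \
    --wait=1 --random-wait --limit-rate=200k \
    http://www.example.com/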

The above command will also fetch "page requisites" like style sheets (CSS), images, and scripts. The downloaded page contents are modified so that links point to the local copy as well. Any web server can host the resulting file set, which results in a static copy of the original web site.

That is, when things go well. Anyone who has ever worked with a computer knows that things seldom go according to plan; all sorts of things can make the procedure derail in interesting ways. For example, it was trendy for a while to have calendar blocks in web sites. A CMS would generate those on the fly and make crawlers go into an infinite loop trying to retrieve all of the pages. Crafty archivers can resort to regular expressions (e.g. Wget has a --reject-regex option) to ignore problematic resources. Another option, if the administration interface for the web site is accessible, is to disable calendars, login forms, comment forms, and other dynamic areas. Once the site becomes static, those will stop working anyway, so it makes sense to remove such clutter from the original site as well.
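
As a concrete sketch, a crawl that skips calendar-style URLs with Wget's --reject-regex might look like this (the regular expression is purely illustrative and will differ from site to site):

# Mirror the site but refuse URLs that look like calendar pages
$ nice wget --mirror --page-requisites --convert-links --adjust-extension \
    --reject-regex '(calendar|\?month=|&year=)' \
    http://www.example.com/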

JavaScript doom

Unfortunately, some web sites are built with much more than pure HTML. In single-page sites, for example, the web browser builds the content itself by executing a small JavaScript program. A simple user agent like Wget will struggle to reconstruct a meaningful static copy of those sites as it does not support JavaScript at all. In theory, web sites should be using progressive enhancement to have content and functionality available without JavaScript but those directives are rarely followed, as anyone using plugins like NoScript or uMatrix will confirm.

Traditional archival methods sometimes fail in the dumbest way. When trying to build an offsite backup of a local newspaper (pamplemousse.ca), I found that WordPress adds query strings (e.g. ?ver=1.12.4) at the end of JavaScript includes. This confuses content-type detection in the web servers that serve the archive, which rely on the file extension to send the right Content-Type header. When such an archive is loaded in a web browser, it fails to load scripts, which breaks dynamic websites.

As the web moves toward using the browser as a virtual machine to run arbitrary code, archival methods relying on pure HTML parsing need to adapt. The solution for such problems is to record (and replay) the HTTP headers delivered by the server during the crawl and indeed professional archivists use just such an approach.

Creating and displaying WARC files

At the Internet Archive, Brewster Kahle and Mike Burner designed the ARC (for "ARChive") file format in 1996 to provide a way to aggregate the millions of small files produced by their archival efforts. The format was eventually standardized as the WARC ("Web ARChive") specification that was released as an ISO standard in 2009 and revised in 2017. The standardization effort was led by the International Internet Preservation Consortium (IIPC), which is an "international organization of libraries and other organizations established to coordinate efforts to preserve internet content for the future", according to Wikipedia; it includes members such as the US Library of Congress and the Internet Archive. The latter uses the WARC format internally in its Java-based Heritrix crawler.

A WARC file aggregates multiple resources like HTTP headers, file contents, and other metadata in a single compressed archive. Conveniently, Wget actually supports the file format with the --warc parameter. Unfortunately, web browsers cannot render WARC files directly, so a viewer or some conversion is necessary to access the archive. The simplest such viewer I have found is pywb, a Python package that runs a simple webserver to offer a Wayback-Machine-like interface to browse the contents of WARC files. The following set of commands will render a WARC file on http://localhost:8080/:

$ pip install pywb
$ wb-manager init example
$ wb-manager add example crawl.warc.gz
$ wayback
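
The crawl.warc.gz file imported above can be produced with Wget's WARC support mentioned earlier, for example:

# Crawl the site and write the result as a WARC archive alongside the mirror
$ wget --mirror --page-requisites --warc-file=crawl http://www.example.com/

Wget compresses the WARC output by default, producing crawl.warc.gz next to the usual mirrored files.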

This tool was, incidentally, built by the folks behind the Webrecorder service, which can use a web browser to save dynamic page contents.

Unfortunately, pywb has trouble loading WARC files generated by Wget because it followed an inconsistency in the 1.0 specification, which was fixed in the 1.1 specification. Until Wget or pywb fix those problems, WARC files produced by Wget are not reliable enough for my uses, so I have looked at other alternatives. A crawler that got my attention is simply called crawl. Here is how it is invoked:

$ crawl https://example.com/

(It does say "very simple" in the README.) The program does support some command-line options, but most of its defaults are sane: it will fetch page requirements from other domains (unless the -exclude-related flag is used), but does not recurse out of the domain. By default, it fires up ten parallel connections to the remote site, a setting that can be changed with the -c flag. But, best of all, the resulting WARC files load perfectly in pywb.

Future work and alternatives

There are plenty more resources for using WARC files. In particular, there's a Wget drop-in replacement called Wpull that is specifically designed for archiving web sites. It has experimental support for PhantomJS and youtube-dl integration that should allow downloading more complex JavaScript sites and streaming multimedia, respectively. The software is the basis for an elaborate archival tool called ArchiveBot, which is used by the "loose collective of rogue archivists, programmers, writers and loudmouths" at ArchiveTeam in its struggle to "save the history before it's lost forever". It seems that PhantomJS integration does not work as well as the team wants, so ArchiveTeam also uses a rag-tag bunch of other tools to mirror more complex sites. For example, snscrape will crawl a social media profile to generate a list of pages to send into ArchiveBot. Another tool the team employs is crocoite, which uses the Chrome browser in headless mode to archive JavaScript-heavy sites.

This article would also not be complete without a nod to the HTTrack project, the "website copier". Working similarly to Wget, HTTrack creates local copies of remote web sites but unfortunately does not support WARC output. Its interactive aspects might be of more interest to novice users unfamiliar with the command line.

In the same vein, during my research I found a full rewrite of Wget called Wget2 that has support for multi-threaded operation, which might make it faster than its predecessor. It is missing some features from Wget, however, most notably reject patterns, WARC output, and FTP support but adds RSS, DNS caching, and improved TLS support.

Finally, my personal dream for these kinds of tools would be to have them integrated with my existing bookmark system. I currently keep interesting links in Wallabag, a self-hosted "read it later" service designed as a free-software alternative to Pocket (now owned by Mozilla). But Wallabag, by design, creates only a "readable" version of the article instead of a full copy. In some cases, the "readable version" is actually unreadable and Wallabag sometimes fails to parse the article. Instead, other tools like bookmark-archiver or reminiscence save a screenshot of the page along with full HTML but, unfortunately, no WARC file that would allow an even more faithful replay.

The sad truth of my experiences with mirrors and archival is that data dies. Fortunately, amateur archivists have tools at their disposal to keep interesting content alive online. For those who do not want to go through that trouble, the Internet Archive seems to be here to stay and Archive Team is obviously working on a backup of the Internet Archive itself.

This article first appeared in the Linux Weekly News.

As usual, here's the list of issues and patches generated while researching this article:

I also want to personally thank the folks in the #archivebot channel for their assistance and letting me play with their toys.

The Pamplemousse crawl is now available on the Internet Archive; it might end up in the Wayback Machine at some point if the Archive curators think it is worth it.

Another example of a crawl is this archive of two Bloomberg articles which the "save page now" feature of the Internet archive wasn't able to save correctly. But webrecorder.io could! Those pages can be seen in the web recorder player to get a better feel of how faithful a WARC file really is.

Finally, this article was originally written as a set of notes and documentation in the archive page which may also be of interest to my readers.

Antoine Beaupré https://anarc.at/tag/debian-planet/ pages tagged debian-planet

VLC in Debian now can do bittorrent streaming

Planet Debian - Mon, 24/09/2018 - 9:20pm

Back in February, I got curious to see if VLC now supported Bittorrent streaming. It did not, despite the fact that the idea and code to handle such streaming had been floating around for years. I did however find a standalone plugin for VLC to do it, and half a year later I decided to wrap up the plugin and get it into Debian. I uploaded it to NEW a few days ago, and am very happy to report that it entered Debian a few hours ago, and should be available in Debian/Unstable tomorrow, and Debian/Testing in a few days.

With the vlc-plugin-bittorrent package installed you should be able to stream videos using a simple call to

vlc https://archive.org/download/TheGoat/TheGoat_archive.torrent

It can handle magnet links too. Now if only native vlc had bittorrent support. Then a lot more people would be helping each other to share public domain and creative commons movies. The plugin needs some stability work with seeking and picking the right file in a torrent with many files, but is already usable. Please note that the plugin does not remove downloaded files when vlc is stopped, so it can fill up your disk if you are not careful. Have fun. :)

I would love to get help maintaining this package. Get in touch if you are interested.

As usual, if you use Bitcoin and want to show your support of my activities, please send Bitcoin donations to my address 15oWEoG9dUPovwmUL9KWAnYRtNJEkP1u1b.

Petter Reinholdtsen http://people.skolelinux.org/pere/blog/ Petter Reinholdtsen - Entries tagged english

Riccardo Padovani: Responsible disclosure: retrieving a user's private Facebook friends.

Planet Ubuntu - Sun, 23/09/2018 - 11:00am

Data access control isn’t easy. While it can sound quite simple (just give access to the authorized entities), it is very difficult, both on a theoretical side (who is an authorized entity? What does authorized mean? And how do we identify an entity?) and on a practical side.

On the practical side, as we will see, disclosure of private data is often an unwanted side effect of a useful feature.

Facebook and Instagram

Facebook bought Instagram back in 2012. Since then, a lot of integrations have been implemented between them: among others, when you sign up for Instagram, it will suggest people to follow based on your Facebook friends.

Your Instagram and Facebook accounts are then somehow linked: this happens if you sign up to Instagram using your Facebook account (doh!), but also if you sign up to Instagram creating a new account but using the same email you use in your Facebook account (there are also other ways Instagram links your new account with an existing Facebook account, but they are not of interest here).

So if you want to create a secret Instagram account, create a new mail for it ;-)

Back on topic: Instagram used to enable all of its features for new users before they had confirmed their email address. This was done so as not to “interrupt” usage of the website / app; users would have time to confirm the email later on.

Email address confirmation is useful to confirm you are signing up using your own email address, and not someone else’s.

Data leak

One of the features available before confirming the email address was the suggestion of whom to follow, based on the Facebook friends of the account Instagram had automatically linked.

This made it super easy to retrieve the Facebook friend list of anyone who doesn’t have an Instagram account, and since there are more than 2 billion Facebook accounts but just 800 million Instagram accounts, it means that well over a billion accounts were vulnerable.

The method was simple: knowing the email address of the target (and an email address is anything but secret), the attacker just had to sign up to Instagram with that email, and then go to the suggestions of people to follow to see the victim’s friends.

Conclusion

The combination of two useful features (suggestion of people to follow based on a linked Facebook account, being able to use the new Instagram account immediately) made this data leak possible.

It didn’t matter whether the attacker was Facebook friends with the victim, or what the privacy settings of the victim’s Facebook account were. Heck, the attacker didn’t need a Facebook account at all!

Timeline
  • 20 August 2018: first disclosure to Facebook
  • 20 August 2018: request of other information from Facebook
  • 20 August 2018: more information provided to Facebook
  • 21 August 2018: Facebook closed the issue saying it wasn’t a security issue
  • 21 August 2018: I submitted a new demo with more information
  • 23 August 2018: Facebook confirmed the issue
  • 30 August 2018: Facebook deployed a fix and asked for a test
  • 12 September 2018: Facebook awarded me a bounty
Bounty

Facebook awarded me a $3000 bounty for the disclosure. This was the first time I was awarded for a security disclosure by Facebook; I am quite happy with the result and I applaud Facebook for making the whole process really straightforward.

For any comment, feedback, critic, write me on Twitter (@rpadovani93) or drop an email at riccardo@rpadovani.com.

Regards, R.

Stephen Michael Kellat: And Another Thing

Planet Ubuntu - Sun, 23/09/2018 - 4:53am

My Zotero database has some unfortunate comparisons and contrasts in it. For example:

Crowe, J. (2018, September 21). Google Employees Considered Changing Search Algorithm to Fight Travel Ban. Retrieved September 22, 2018, from https://www.nationalreview.com/news/google-employees-considered-changing-search-algorithm-to-fight-travel-ban/

Not the happiest of news that, apparently, President Donald John Trump isn't totally unjustified in his paranoia. The blackbox that is search at Google can potentially be tampered with. Without any understanding of what goes on inside Google's "black box" system, there isn't really much to assuage President Trump's fears.

That this sort of a possibility could come up in 2018 should not be surprising. After all, here are some further citations from my Zotero database:

Kellat, S. M. (2006). Intellectual Terrorism and the Church: The Case of the Google Bomb. Conference paper. Retrieved from http://eprints.rclis.org/10147/

Kellat, S. M. (2007). Print-Based Culture Meets An “Amazoogle” World: New Challenges To A Priesthood of Readers. Conference paper. Retrieved from http://eprints.rclis.org/10146/

I suppose I merely wrote about the matter initially in terms of malicious external actors twelve years ago. The idea of internal malicious actors came up eleven years ago in my writing. After that I began following the various color uprisings and the like but forgot to keep writing. I used to be a working academic but for some reason detoured into being a tax collector these days after spending time as a podcaster.

There seems to be low-hanging fruit to pursue again in research about this digital life.

Handling an old Digital Photo Frame (AX203) with Debian (and gphoto2)

Planet Debian - Sat, 22/09/2018 - 8:20pm

Some days ago I found a key chain at home that was a small digital photo frame, and it seems it had not been used since 2009 (old times, when I was not yet using Debian at home). The photo frame was still working (I connected it with a USB cable and after some seconds it turned on), and indeed it showed 37 photos from 2009.

When I connected it to the computer with a USB cable, it asked “Connect USB? Yes/No”. I pressed the button saying “yes” and nothing happened on the computer (I was expecting a USB drive to be shown in Dolphin, but no).

I looked at “dmesg” output and it was shown as a CDROM:

[ 1620.497536] usb 3-2: new full-speed USB device number 4 using xhci_hcd
[ 1620.639507] usb 3-2: New USB device found, idVendor=1908, idProduct=1320
[ 1620.639513] usb 3-2: New USB device strings: Mfr=1, Product=2, SerialNumber=0
[ 1620.639515] usb 3-2: Product: Photo Frame
[ 1620.639518] usb 3-2: Manufacturer: BUILDWIN
[ 1620.640549] usb-storage 3-2:1.0: USB Mass Storage device detected
[ 1620.640770] usb-storage 3-2:1.0: Quirks match for vid 1908 pid 1320: 20000
[ 1620.640807] scsi host7: usb-storage 3-2:1.0
[ 1621.713594] scsi 7:0:0:0: CD-ROM buildwin Photo Frame 1.01 PQ: 0 ANSI: 2
[ 1621.715400] sr 7:0:0:0: [sr1] scsi3-mmc drive: 40x/40x writer cd/rw xa/form2 cdda tray
[ 1621.715745] sr 7:0:0:0: Attached scsi CD-ROM sr1
[ 1621.715932] sr 7:0:0:0: Attached scsi generic sg1 type 5

But not automounted.
I mounted it and then looked at the files, but I couldn’t find photos there, only these files:

Autorun.inf FEnCodeUnicode.dll LanguageUnicode.ini DPFMate.exe flashlib.dat StartInfoUnicode.ini

The Autorun.inf file was pointing to the DPFMate.exe file.

I connected the device to a Windows computer and then I could run the DPFMate.exe program, and it was a program to manage the photos in the device.

I was wondering if I could manage the device from Debian and then searched for «dpf “digital photo frame” linux dpfmate» and found this page:

http://www.penguin.cz/~utx/hardware/Abeyerr_Digital_Photo_Frame/

Yes, that one was my key chain!

I looked for gphoto in Debian, going to https://packages.debian.org/gphoto, and then learned that the program I needed to install was gphoto2.
I installed it and then went to its Quick Start Guide to learn how to access the device, get the photos, etc. In particular, I used these commands:

gphoto2 --auto-detect
Model                                      Port
----------------------------------------------------------
AX203 USB picture frame firmware ver 3.4.x usbscsi:/dev/sg1

gphoto2 --get-all-files

(it copied all the pictures that were in the photo frame, to the current folder in my computer)

gphoto2 --upload-file=name_of_file

(to put some file in the photo frame)

gphoto2 --delete-file=1-38

(to delete files 1 to 38 in the photo frame).
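
gphoto2 can also list the files currently stored on the frame, which is handy for checking the result of an upload or a deletion:

gphoto2 --list-files

(lists the files in the photo frame together with the index numbers that --delete-file uses).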

larjona https://larjona.wordpress.com English – The bright side
