over 1 year ago

從 郵局的網站 找到全省縣市鄉鎮的資料
衛福部找到登記在案的婦產科,跟用google找到的有很大的誤差,應該是很多婦產科其實是沒有自己的網站吧。

放在Excel的檔案裡面(沒辦法,跟Numbers不熟,用起來不順手。)
所以要找一個可以讀取excel的gem,
可是好像都不是很好用,最後找到spreadsheet
剛好可以滿足我的需求。

How do I read the content of an Excel spreadsheet using Ruby?

How do I read the content of an Excel spreadsheet using Ruby?
require 'spreadsheet'    
book = Spreadsheet.open('myexcel.xls')
sheet1 = book.worksheet('Sheet1') # can use an index or worksheet name

sheet1.each do |row|
  break if row[0].nil? # if first cell empty

  puts row.join(',') # looks like it calls "to_s" on each cell's Value

end

上面這招就很夠用了。

幾個注意的點:
1. Numbers要用輸出,能夠變成 .xls 跟 .xlsx,兩種檔案格式。可是spreadsheet gem只能讀.xls
2. irb跟cloud9(rails c)裡面,可以用require "spreadsheet"
3. 可是在rails c 裡面,要用require "Spreadsheet",記得要大寫
4. 好像還有其他解法:
Gem loads in irb but not console
Gem available in irb but not rails console
How to use a local gem in console with bundled environment
rails console require nokogiri return false (but works in irb)
5. 讀取excel還挺麻煩的,目前是覺得把他弄成一列一列的好像比較簡單,row。因為好像沒看到column的用法。所以都要把檔案調整一下。
6. numbers or excel裡面的公式,要記得調成。不然spreadsheet在處理的時候,會跑出來Formula
7. cloud9,如果資料庫設成mysql,每次進入rails c 之前要記得先開啟mysql-ctl start

rails c 試用暖身看看
require 'Spreadsheet'    
book = Spreadsheet.open('hospitals.xls')
sheet1 = book.worksheet('婦產科')
sheet1.each do |row|
  break if row[0].nil? # if first cell empty

  puts row.join(',') # looks like it calls "to_s" on each cell's Value

end

處理城市的資料:

處理城市
book = Spreadsheet.open('hospitals.xls')
sheet1 = book.worksheet('郵遞區號 (3)')

sheet1.each do |row|
  break if row[0].nil? # if first cell empty

  puts row.join(',') # looks like it calls "to_s" on each cell's Value

end

#把城市(City)的資料加到array裡面

city_name = []
sheet1.each do |row|
  break if row[0].nil? 
  city_name << row[1]
end;true

city_name.each do |name|
    @city = City.new
    @city.name = name
    @city.save
end

處理鄉鎮的資料:

處理鄉鎮的資料
book = Spreadsheet.open('hospitals.xls')
sheet1 = book.worksheet('郵遞區號 (3)')

#把鄉鎮的資料加到array裡面

raw_data =[]
sheet1.each do |row|
  break if row[0].nil?
  row.each do |data|
    raw_data << data
  end
end

#把nil的資料刪掉

list_data = raw_data.delete_if{|i| i.nil?}
#或是list_data = raw_data.compact!


#把數字的資料刪掉

final_data = list_data.delete_if{|i| i.is_a?(Numeric)}

#把鄉鎮按照城市別分類

temp = []
final_data.each do |str|
  case str
  when "臺北市" , "基隆市" , "新北市", "宜蘭縣", "新竹市", "新竹縣", "桃園市", "苗栗縣", "臺中市", "彰化縣", "南投縣", "嘉義市", "嘉義縣", "雲林縣", "臺南市", "高雄市", "南海諸島", "澎湖縣", "屏東縣", "臺東縣","花蓮縣", "金門縣", "連江縣"
    temp << []
  else
    temp[-1] << str
  end
end



#測試

i = 0
begin
temp[i].each do |p|
 puts p + "city_id=" + (i+1).to_s
 end
i += 1
end while i < 23

#存到區域(District)的資料庫去

i = 0
begin
temp[i].each do |p|
    @district = District.new
    @district.name = p
    @district.city_id = (i+1).to_s
    @district.save
 end
i += 1
end while i < 23


處理醫院:

rails c 把醫院名字、縣市、區域存進去資料庫
require 'Spreadsheet'    
book = Spreadsheet.open('hospitals.xls')
sheet1 = book.worksheet('婦產科')

#確認資料

sheet1.each do |row|
    break if row[0].nil? # if first cell empty

 puts row[1] + ":" +row[8].to_i.to_s + ":" + row[9].to_i.to_s
 end

#存入資料庫

sheet1.each do |row|
    break if row[0].nil? # if first cell empty

    @hospital = Hospital.new
    @hospital.name = row[1]
    @hospital.city_id = row[8].to_i
    @hospital.district_id = row[9].to_i
    @hospital.save
end


其他Spreadsheet gem的參考資料:
Parse excel file in python and ruby
Getting Started with Spreadsheet

← 用 mechanize gem 去爬 google search page (next page) 用 mechanize gem 去爬 yahoo search page (first page) →